Affinity peptides and method for purification of recombinant proteins

ABSTRACT

This invention describes a process for separating a fusion protein or polypeptide in the form of its precursor from a mixture containing said fusion protein and impurities, which comprises contacting said fusion protein with a resin containing immobilized metal ions, said fusion protein covalently operably linked directly or indirectly to an immobilized metal ion-affinity peptide, binding said fusion protein to said resin, and selectively eluting said fusion protein from said resin.

REFERENCE TO RELATED APPLICATION

[0001] This application is a non-provisional application claiming priority from provisional application Serial No. 60/388,059, filed Jun. 12, 2002.

FIELD OF THE INVENTION

[0002] This invention relates to affinity peptides, fusion proteins containing affinity peptides, genes coding for such proteins, expression vectors and transformed microorganisms containing such genes, and methods for the purification of the fusion proteins.

BACKGROUND OF THE INVENTION

[0003] The possibility of preparing hybrid genes by gene technology has opened up new routes for the analysis of recombinant proteins. By linking the coding gene sequence of a desired protein to the coding gene sequence of a protein fragment having a high affinity for a ligand (affinity peptide), it is possible to purify desired recombinant proteins in the form of fusion proteins in one step using the affinity peptide.

[0004] Immobilized metal affinity chromatography (IMAC), also known as metal chelate affinity chromatography (MCAC), is a specialized aspect of affinity chromatography. The principle behind IMAC lies in the fact that many transition metal ions, e.g., nickel, zinc and copper, can coordinate to the amino acids histidine, cysteine, and tryptophan via electron donor groups on the amino acid side chains. To utilize this interaction for chromatographic purposes, the metal ion is typically immobilized onto an insoluble support. This can be done by attaching a chelating group to the chromatographic matrix. Most importantly, to be useful, the metal of choice must have a higher affinity for the matrix than for the compounds to be purified.

[0005] In U.S. Pat. No. 4,569,794, Smith et al. disclose the preparation of a fusion protein containing a metal ion-affinity peptide linker and a biologically active polypeptide, expressing the fusion protein, and purifying it using immobilized metal ion chromatography. Because essentially any biologically active polypeptide could be used, this approach enabled the convenient expression and purification of essentially any biologically active polypeptide by immobilized metal ion chromatography.

[0006] In U.S. Pat. Nos. 5,310,663 and 5,284,933, Dobeli et al. disclose a process for separating a biologically active polypeptide from impurities by producing the desired polypeptide as a fusion protein containing a metal ion-affinity peptide linker comprising 2 to 6 adjacent histidine residues. Although Dobeli et al.'s metal ion-affinity peptide provides greater metal affinity relative to certain of the sequences disclosed by Smith et al., there is some cautionary evidence that proteins containing His-tags may differ from their wild-type counterparts in dimerization/oligomerization properties. For example, Wu and Filutowicz present evidence that the biochemical properties of the pi(30.5) protein of plasmid R6K, a DNA binding protein, were fundamentally altered due to the presence of an N-terminal 6×His-tag. Wu, J. and Filutowicz, M., Acta Biochim. Pol., 46:591-599, 1999. In addition, Rodriguez-Viciana et al. stated that V12 Ras proteins expressed as histidine-tagged fusion proteins exhibited poor biological activity. Rodriguez-Viciana, P., et al., Cell, 89:457-67, 1997.

SUMMARY OF THE INVENTION

[0007] One aspect of the present invention is a peptide which is relatively hydrophilic, is capable of exhibiting appropriate biological activity, and has a relatively high affinity for coordinating metals. Advantageously, this metal ion-affinity peptide may be incorporated into a fusion protein to enable ready purification of the fusion protein from aqueous solutions by immobilized metal affinity chromatography. In addition to the metal ion-affinity peptide, the fusion protein typically comprises a protein or polypeptide of interest, covalently linked, directly or indirectly, to the metal ion-affinity peptide.

[0008] Briefly, therefore, the present invention is directed to a peptide represented by the formula R₁-Sp₁-(His-Z₁-His-Arg-His-Z₂-His)-Sp₂-R₂, wherein (His-Z₁-His-Arg-His-Z₂-His) is a metal ion-affinity peptide, R₁ is hydrogen, a polypeptide, protein or protein fragment, Sp₁ is a covalent bond or a spacer comprising at least one amino acid residue, R₂ is hydrogen, a polypeptide, protein or protein fragment, Sp₂ is a covalent bond or a spacer comprising at least one amino acid residue, Z₁ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val; and Z₂ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr, and Val.

[0009] The present invention is further directed to a process for separating a recombinant protein or polypeptide from a liquid mixture wherein the recombinant protein or polypeptide comprises a metal ion-affinity peptide having the sequence His-Z₁-His-Arg-His-Z₂-His and Z₁ and Z₂ are as previously defined. In the process, the mixture is combined with a solid support having immobilized metal ions to bind the recombinant protein or polypeptide, and the bound recombinant protein or polypeptide is then eluted from the solid support.

[0010] The present invention is further directed to vectors and host cells for recombinant expression of the nucleic acid molecules described herein, as well as methods of making such vectors and host cells and for using them for production of the polypeptides or peptides of the present invention by recombinant techniques.

[0011] The present invention is further directed to a kit for the expression and/or separation of the recombinant proteins or polypeptides from a mixture wherein the recombinant proteins or polypeptides contain the sequence R₁-Sp₁-(His-Z₁-His-Arg-His-Z₂-His)-Sp₂-R₂, and R₁, R₂, Sp₁, Sp₂, Z₁ and Z₂ are as previously defined. The kit may comprise, in separate containers, the nucleic acid components to be assembled into a vector encoding for a fusion protein comprising a protein or polypeptide covalently operably linked directly or indirectly to an immobilized metal ion-affinity peptide. In addition, or alternatively, the kit may be comprised of one or more of the following: buffers, enzymes, a chromatography column comprising a resin containing immobilized metal ions and an instructional brochure explaining how to use the kit.

[0012] Other objects and advantages of the present invention will become apparent as the detailed description of the invention proceeds.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0013] The present invention generally relates to the expression and purification of recombinant polypeptides, proteins or protein fragments containing a metal ion-affinity peptide. In addition to the metal ion-affinity peptide, the recombinant polypeptides and proteins will typically also contain a target polypeptide, protein or fragment thereof covalently linked to the metal ion-affinity peptide. In one embodiment, the target polypeptide, protein or protein fragment is a biologically active protein or protein fragment. Advantageously, the metal ion-affinity peptide enables the recombinant polypeptides and proteins to be readily purified from a liquid sample by means of metal ion affinity chromatography.

[0014] The fusion proteins of this invention are prepared by recombinant DNA methodology. In accordance with the present invention, a gene sequence coding for a desired protein is isolated, synthesized or otherwise obtained and operably linked to a DNA sequence coding for the metal ion-affinity peptide. The hybrid gene containing the gene for a desired protein operably linked to a DNA sequence encoding the metal ion-affinity peptide is referred to as a chimeric gene.

[0015] In one embodiment, the metal ion-affinity peptide is covalently linked to the carboxy terminus of the target polypeptide, protein or protein fragment. In another embodiment, the metal ion-affinity peptide is covalently linked to the amino terminus of the target polypeptide, protein or protein fragment. In each of these embodiments, the metal ion-affinity peptide and the target polypeptide, protein or protein fragment may be directly attached by means of a peptide bond or, alternatively, the two may be separated by a linker. When present, the linker may provide other functionality to the recombinant polypeptide, protein or protein fragment.

[0016] The recombinant polypeptides, proteins or protein fragments of the present invention are defined by the general formula (I):

R₁-Sp₁-(His-Z₁-His-Arg-His-Z₂-His)-Sp₂-R₂ (I)

[0017] wherein (His-Z₁-His-Arg-His-Z₂-His) is a metal ion-affinity peptide; Z₁ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val; and Z₂ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr and Val. In addition, R₁ is hydrogen, a polypeptide, protein or protein fragment, Sp₁ is a covalent bond or a spacer comprising at least one amino acid residue, R₂ is hydrogen, a polypeptide, protein or protein fragment, Sp₂ is a covalent bond or a spacer comprising at least one amino acid residue. Thus, for example, R₁ or R₂ may comprise a target polypeptide, protein, or protein fragment which is directly or indirectly linked to the metal ion-affinity peptide.
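By way of illustration only, the definition of the metal ion-affinity peptide in formula (I) can be checked against a protein sequence written in one-letter amino acid code. The following Python sketch is not part of the disclosed method; the names Z1, Z2, MOTIF and find_affinity_peptides are hypothetical, and the use of a regular expression is an assumption made for the example.

```python
import re

# One-letter codes for the residues permitted at Z1 and Z2 in formula (I).
# Z1: Ala Arg Asn Asp Gln Glu Ile Lys Phe Pro Ser Thr Trp Val
Z1 = "ARNDQEIKFPSTWV"
# Z2: Ala Arg Asn Asp Cys Gln Glu Gly Ile Leu Lys Met Pro Ser Thr Tyr Val
Z2 = "ARNDCQEGILKMPSTYV"

# His-Z1-His-Arg-His-Z2-His in one-letter code.
MOTIF = re.compile(f"H[{Z1}]HRH[{Z2}]H")


def find_affinity_peptides(sequence):
    """Return (start_index, matched_text) for each non-overlapping occurrence."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical fusion: Met, then the peptide with Z1 = Asn and Z2 = Lys, then a tail.
    print(find_affinity_peptides("MHNHRHKHGSAAA"))
```

Because the search is non-overlapping, two copies that share a terminal histidine would be reported as a single occurrence in this sketch.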

[0018] Metal Ion-Affinity Peptide

[0019] In one embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z₁ is an amino acid selected from the group consisting of Ala, Asn, Ile, Lys, Phe, Ser, Thr, and Val; and Z₂ is an amino acid selected from the group consisting of Ala, Asn, Gly, Lys, Ser, Thr, and Tyr; and R₁, R₂, Sp₁, and Sp₂ are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R₁ or R₂) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R₁ or R₂), may be directly fused (when Sp₁ or Sp₂ is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp₁ or Sp₂ is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

[0020] In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z₁ is an amino acid selected from the group consisting of Asn and Lys; and Z₂ is an amino acid selected from the group consisting of Gly and Lys; and R₁, R₂, Sp₁, and Sp₂ are as previously defined. For example, in one such embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I) wherein Z₁ is Asn, Z₂ is Lys and R₁, R₂, Sp₁, and Sp₂ are as previously defined. By way of further example, in another such embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I) wherein Z₁ is Lys and Z₂ is Gly. In each of these alternatives, the target polypeptide, protein or protein fragment (R₁ or R₂) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R₁ or R₂), may be directly fused (when Sp₁ or Sp₂ is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp₁ or Sp₂ is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

[0021] In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z₁ is Ile, Z₂ is Asn, and R₁, R₂, Sp₁, and Sp₂ are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R₁ or R₂) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R₁ or R₂), may be directly fused (when Sp₁ or Sp₂ is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp₁ or Sp₂ is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

[0022] In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z₁ is Thr, Z₂ is Ser, and R₁, R₂, Sp₁, and Sp₂ are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R₁ or R₂) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R₁ or R₂), may be directly fused (when Sp₁ or Sp₂ is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp₁ or Sp₂ is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

[0023] In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z₁ is Ser, Z₂ is Tyr, and R₁, R₂, Sp₁, and Sp₂ are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R₁ or R₂) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R₁ or R₂), may be directly fused (when Sp₁ or Sp₂ is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp₁ or Sp₂ is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

[0024] In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z₁ is Val, Z₂ is Ala, and R₁, R₂, Sp₁, and Sp₂ are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R₁ or R₂) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R₁ or R₂), may be directly fused (when Sp₁ or Sp₂ is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp₁ or Sp₂ is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

[0025] In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z₁ is Ala, Z₂ is Lys, and R₁, R₂, Sp₁, and Sp₂ are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R₁ or R₂) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R₁ or R₂), may be directly fused (when Sp₁ or Sp₂ is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp₁ or Sp₂ is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

[0026] In a further embodiment, R₁ may be a polypeptide which drives expression of the fusion protein and R₂ is the target polypeptide, protein or protein fragment. In this embodiment, each of Sp₁ and Sp₂ may be a covalent bond or a spacer, independently of the other. Thus, for example, R₁ may be directly fused to the metal ion-affinity peptide or separated from the metal ion-affinity peptide by a spacer independently of whether R₂ is directly fused to the metal ion-affinity peptide or separated from the metal ion-affinity peptide by a spacer; all of these combinations and permutations are contemplated. This type of arrangement is particularly useful when chimeric proteins are constructed which comprise epitopes from two portions of an antigenic protein or from two different antigenic proteins. Such chimeric proteins may be useful in vaccine preparations.

[0027] In another embodiment, the recombinant polypeptides, proteins or protein fragments of the present invention comprise multiple copies of the metal ion-affinity peptide (His-Z₁-His-Arg-His-Z₂-His) wherein Z₁ and Z₂ are as previously defined. In this embodiment, the additional copies of the metal ion-affinity peptide may occur in either or both of the spacer domains (Sp₁ and Sp₂) or in either or both of the other domains (R₁ and R₂) of the recombinant polypeptides, proteins or protein fragments. Thus, for example, in one embodiment a second copy of the metal ion-affinity peptide (His-Z₁-His-Arg-His-Z₂-His) wherein Z₁ and Z₂ are as previously defined is located in one of the spacer domains (Sp₁ or Sp₂) or other domains (R₁ and R₂) of the recombinant polypeptides, proteins or protein fragments. By way of further example, in another embodiment two additional copies of the metal ion-affinity peptide (His-Z₁-His-Arg-His-Z₂-His) wherein Z₁ and Z₂ are as previously defined are located in the spacer domains (Sp₁ or Sp₂) or other domains (R₁ and R₂) of the recombinant polypeptides, proteins or protein fragments. By way of further example, in another embodiment at least three additional copies of the metal ion-affinity peptide (His-Z₁-His-Arg-His-Z₂-His) wherein Z₁ and Z₂ are as previously defined are located in the spacer domains (Sp₁ or Sp₂) or other domains (R₁ and R₂) of the recombinant polypeptides, proteins or protein fragments. In each of these embodiments, the multiple copies of the metal ion-affinity peptide may be separated by one or more amino acid residues (i.e., a spacer) as described herein. Alternatively, in each of these embodiments the multiple copies of the metal ion-affinity peptide may be directly linked to each other without any intervening amino acid residues. Thus, for example, in one such embodiment the recombinant polypeptides, proteins or protein fragments of the present invention may be defined by the general formula (II):

R₁-Sp₁-(His-Z₁-His-Arg-His-Z₂-His)_(t)-Sp₂-R₂  (II)

[0028] wherein (His-Z₁-His-Arg-His-Z₂-His) is a metal ion-affinity peptide; t is at least 2 and R₁, R₂, Z₁, Z₂, Sp₁ and Sp₂ are as previously defined. By way of further example, in one such embodiment the recombinant polypeptides, proteins or protein fragments of the present invention may be defined by the general formula (III):

R₁-Sp₁-[(His-Z₁-His-Arg-His-Z₂-His)-Sp₂]_(t)-R₂  (III)

[0029] wherein (His-Z₁-His-Arg-His-Z₂-His) is a metal ion-affinity peptide; t is at least 2 and R₁, R₂, Z₁, Z₂, Sp₁ and Sp₂ are as previously defined; in addition, each Sp₂ of the recombinant polypeptides, proteins or protein fragments corresponding to general formula (III) may be the same or different.
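The difference between general formulae (II) and (III) may be illustrated by assembling example sequences in one-letter code. The Python sketch below is illustrative only: the function names are hypothetical, an empty string stands in for hydrogen or a covalent bond, and every copy of the peptide is assumed to use the same Z₁ and Z₂.

```python
def affinity_peptide(z1, z2):
    """One-letter sequence for His-Z1-His-Arg-His-Z2-His."""
    return f"H{z1}HRH{z2}H"


def formula_ii(r1, sp1, z1, z2, t, sp2, r2):
    """R1-Sp1-(peptide)_t-Sp2-R2: t copies linked directly to one another."""
    if t < 2:
        raise ValueError("formula (II) requires t of at least 2")
    return r1 + sp1 + affinity_peptide(z1, z2) * t + sp2 + r2


def formula_iii(r1, sp1, z1, z2, sp2_list, r2):
    """R1-Sp1-[(peptide)-Sp2]_t-R2: each copy is followed by its own Sp2.

    t is the length of sp2_list; the individual Sp2 entries may differ.
    """
    if len(sp2_list) < 2:
        raise ValueError("formula (III) requires t of at least 2")
    repeats = "".join(affinity_peptide(z1, z2) + sp2 for sp2 in sp2_list)
    return r1 + sp1 + repeats + r2


if __name__ == "__main__":
    # Hypothetical target "MKT"; "GS" is an arbitrary example spacer.
    print(formula_ii("", "", "N", "K", 2, "GS", "MKT"))
    print(formula_iii("M", "", "K", "G", ["GS", ""], "AAA"))
```

In formula (II) the copies are contiguous, whereas in formula (III) a spacer, which may be a covalent bond, follows each copy.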

[0030] Target Polypeptide, Protein or Protein Fragment

[0031] The target polypeptide, protein or protein fragment may be composed of any proteinaceous substance that can be expressed in transformed host cells. Accordingly, the present invention may be beneficially employed to produce substantially any prokaryotic or eukaryotic, simple or conjugated, protein that can be expressed by a vector in a transformed host cell. For example, the target protein may be

[0032] a) an enzyme, whether oxidoreductase, transferase, hydrolase, lyase, isomerase or ligase;

[0033] b) a storage protein, such as ferritin or ovalbumin or a transport protein, such as hemoglobin, serum albumin or ceruloplasmin;

[0034] c) a protein that functions in contractile and motile systems such as actin or myosin;

[0035] d) any of a class of proteins that serve a protective or defense function, such as the blood protein fibrinogen or a binding protein, such as antibodies or immunoglobulins that bind to and thus neutralize antigens;

[0036] e) a hormone such as human Growth Hormone, somatostatin, prolactin, estrone, progesterone, melanocyte, thyrotropin, calcitonin, gonadotropin and insulin;

[0037] f) a hormone involved in the immune system, such as interleukin-1, interleukin-2, colony stimulating factor, macrophage-activating factor and interferon;

[0038] g) a toxic protein, such as ricin from castor bean or gossypin from cotton linseed;

[0039] h) a protein that serves as structural elements such as collagen, elastin, alpha-keratin, glyco-proteins, viral proteins and muco-proteins; or

[0040] i) a synthetic protein, defined generally as any sequence of amino acids not occurring in nature.

[0041] In general, the target polypeptide, protein or protein fragment may be a constituent of the R₁ and R₂ moieties of the recombinant polypeptides, proteins or protein fragments corresponding to general formulae (I), (II) and (III).

[0042] Genes coding for the various types of protein molecules identified above may be obtained from a variety of prokaryotic or eukaryotic sources, such as plant or animal cells or bacteria cells. The genes can be isolated from the chromosome material of these cells or from plasmids of prokaryotic cells by employing standard, well-known techniques. A variety of naturally occurring and synthesized plasmids having genes coding for many different protein molecules are now commercially available from a variety of sources. The desired DNA also can be produced from mRNA by using the enzyme reverse transcriptase. This enzyme permits the synthesis of DNA from an RNA template.

[0043] In one embodiment, R₁ may be a protein which enhances expression and R₂ is the target polypeptide, protein, or protein fragment. It is well known that the presence of some proteins in a cell results in expression of genes. If a chimeric protein contains an active portion of the protein which prompts or enhances expression of the gene encoding it, greater quantities of the protein may be expressed than if it were not present.

[0044] Linker and Other Optional Elements

[0045] In one embodiment, the recombinant polypeptide, protein or protein fragment includes a spacer (Sp₁ or Sp₂) between the metal ion-affinity polypeptide and the target polypeptide, protein or protein fragment. If present, the spacer may simply comprise one or more, e.g., three to ten amino acid residues, separating the metal ion-affinity peptide from the target polypeptide, protein or protein fragment. Alternatively, the spacer may comprise a sequence which imparts other functionality, such as a proteolytic cleavage site, a fusion protein, a secretion sequence (e.g., OmpA or OmpT for E. coli, preprotrypsin for mammalian cells, α-factor for yeast, and melittin for insect cells), a leader sequence for cellular targeting, antibody epitopes, or IRES (internal ribosomal entry sequences).

[0046] In one embodiment, the spacer is selected from among hydrophilic amino acids to increase the hydrophilic character of the recombinant polypeptide, protein or protein fragment. Alternatively, the amino acid(s) of the spacer domain may be selected to impart a desired folding to the recombinant polypeptide, protein or protein fragment thereby increasing accessibility to one or more regions of the molecule. For example, the spacer domain may comprise glycine residues which result in a protein folding conformation which allows for improved accessibility to antibodies.

[0047] In another embodiment, the spacer comprises a cleavage site which consists of a unique amino acid sequence cleavable by use of a sequence-specific proteolytic agent. Such a site would enable the metal ion-affinity polypeptide to be readily cleaved from the target polypeptide, protein or protein fragment by digestion with a proteolytic agent specific for the amino acids of the cleavage site. Alternatively, the metal ion-affinity peptide may be removed from the desired protein by chemical cleavage using methods known to the art.

[0048] When present, the cleavable site may be located at the amino or carboxy terminus of the target peptide. Preferably, the cleavable site is immediately adjacent to the desired protein to enable separation of the desired protein from the metal ion-affinity peptide. This cleavable site preferably does not appear in the desired protein. In one embodiment, the cleavable site is located at the amino terminus of the desired protein. If the cleavable site is located at the amino terminus of the desired protein and if there are remaining extraneous amino acids on the desired protein after cleavage with the proteolytic agent, an endopeptidase such as trypsin, clostripain or furin may be utilized to remove these remaining amino acids, thus resulting in a highly purified desired protein. Further examples of proteolytic enzymatic agents useful for cleavage are papain, pepsin, plasmin, thrombin, enterokinase, and the like. Each effects cleavage at a particular amino acid sequence which it recognizes.

[0049] Digestion with a proteolytic agent may occur while the fusion protein is still bound to the affinity resin or alternatively, the fusion protein may be eluted from the affinity resin and then digested with the proteolytic agent in order to further purify the desired protein. Preferably, the amino acid sequence of the proteolytic cleavage site is unique, thus minimizing the possibility that the proteolytic agent will cleave the desired protein. In one embodiment, the cleavable site comprises amino acids for an enterokinase, thrombin or a Factor Xa cleavage site.

[0050] Enterokinase recognizes several sequences: Asp-Lys; Asp-Asp-Lys; Asp-Asp-Asp-Lys; and Asp-Asp-Asp-Asp-Lys. The only known natural occurrence of Asp-Asp-Asp-Asp-Lys is in the protein trypsinogen which is a natural substrate for bovine enterokinase and some yeast proteins. As such, by interposing a fragment containing the amino acid sequence Asp-Asp-Asp-Asp-Lys as a cleavable site between the metal ion-affinity polypeptide and the amino terminus of the target polypeptide, protein or protein fragment, the metal ion-affinity polypeptide can be liberated from the desired protein by use of bovine enterokinase with very little likelihood that this enzyme will cleave any portion of the desired protein itself.

[0051] Thrombin cleaves on the carboxy-terminal side of arginine in the following sequence: Leu-Val-Pro-Arg-Gly-X, where X is a non-acidic amino acid. Factor Xa protease (i.e., the activated form of Factor X) cleaves after the Arg in the following sequences: Ile-Glu-Gly-Arg-X, Ile-Asp-Gly-Arg-X, and Ala-Glu-Gly-Arg-X, where X is any amino acid except proline or arginine. A fusion protein comprising the 31 amino-terminal residues of the cII protein, a Factor Xa cleavage site and human β-globin was shown to be cleaved by Factor Xa and generate authentic β-globin. A limitation of the Factor Xa-based fusion systems is the fact that Factor Xa has been reported to cleave at arginine residues that are not present within the Factor Xa recognition sequence. Nagai, K., et al., Prot. Expr. and Purif., 2:372, 1991.
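For illustration, the enterokinase, thrombin and Factor Xa recognition sequences recited above can be located in a candidate fusion sequence written in one-letter code. The Python sketch below is a simplified example and not a validated cleavage predictor; the names SITES and find_cleavage_points are hypothetical. It assumes that enterokinase cuts after the Lys of Asp-Asp-Asp-Asp-Lys, considers only that longest enterokinase sequence, and ignores the off-target Factor Xa cleavage noted above.

```python
import re

# Each pattern consumes residues up to the cut; lookaheads check the residues after it.
SITES = {
    # Asp-Asp-Asp-Asp-Lys; the cut after Lys is an assumption of this sketch.
    "enterokinase": re.compile(r"DDDDK"),
    # Leu-Val-Pro-Arg | Gly-X, where X is not acidic (not Asp or Glu).
    "thrombin": re.compile(r"LVPR(?=G[^DE])"),
    # Ile-Glu-Gly-Arg | X, Ile-Asp-Gly-Arg | X, Ala-Glu-Gly-Arg | X, X not Pro or Arg.
    "factor_xa": re.compile(r"(?:IEGR|IDGR|AEGR)(?=[^PR])"),
}


def find_cleavage_points(sequence):
    """Map each protease name to the 0-based indices of the residues that follow a cut."""
    seq = sequence.upper()
    return {name: [m.end() for m in pattern.finditer(seq)]
            for name, pattern in SITES.items()}


if __name__ == "__main__":
    # Hypothetical fusion: affinity peptide, enterokinase site, then target "MKT".
    fusion = "HNHRHKHDDDDKMKT"
    cut = find_cleavage_points(fusion)["enterokinase"][0]
    print(fusion[:cut], fusion[cut:])
```

A site at the very end of a sequence is not reported for thrombin or Factor Xa in this sketch, because both patterns require the residue or residues following the cut to be present.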

[0052] While less preferred, other unique amino acid sequences for other cleavable sites may also be employed in the spacer without departing from the spirit or scope of the present invention. For instance, the spacer may be composed, at least in part, of a pair of basic amino acids, i.e., Arg, His or Lys. This sequence is cleaved by kallikrein, a glandular enzyme. Also, the spacer may be composed, at least in part, of Arg-Gly, since it is known that the enzyme thrombin will cleave after the Arg if this residue is followed by Gly.

[0053] Regardless of whether a cleavage site is present, the recombinant polypeptide, protein or protein fragment may comprise an antigenic domain in a spacer region (Sp₁ or Sp₂). For example, in one embodiment of the present invention, the recombinant polypeptide, protein or protein fragment comprises one or multiple copies of an antigenic domain generally corresponding to the FLAG® (Sigma-Aldrich, St. Louis, Mo.) peptide sequence joined to a linking sequence containing a single enterokinase cleavage site. Such antigenic domains generally correspond to the sequence:

X²⁰-(X¹-Y-K-X²-X³-D-X⁴)_(n)-X⁵-(X¹-Y-K-X⁷-X⁸-D-X⁹-K)-X²¹

[0054] where:

[0055] D, Y and K are their representative amino acids;

[0056] X²⁰ and X²¹ are independently a hydrogen or a covalent bond;

[0057] each X¹ and X⁴ is independently a covalent bond or at least one amino acid residue, if other than a covalent bond, preferably at least one amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably at least one hydrophilic amino acid residue, and still more preferably at least one aspartate residue;

[0058] each X², X³, X⁷ and X⁸ is independently an amino acid residue, preferably an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

[0059] X⁵ is a covalent bond or a spacer domain comprising at least one amino acid, if other than a covalent bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)_(m)-, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Phe, Pro, Ser, Thr, Trp, Tyr and Val;

[0060] X⁹ is a covalent bond or D; and

[0061] n is 0, 1 or 2.

[0062] In this embodiment, the amino acid sequence X²⁰-(X¹-Y-K-X²-X³-D-X⁴)_(n) comprises copies of the antigenic domain -X¹-Y-K-X²-X³-D- joined in tandem, which are joined to a linking sequence (X¹-Y-K-X⁷-X⁸-D-X⁹-K). The antigenic domains may be immediately adjacent to each other when n is at least one and X⁴ is a covalent bond; optionally, X⁴ may be a spacer domain interposed between the multiple copies of antigenic domains. The linking sequence contains a single enterokinase cleavable site which is represented by the sequence -X⁷-X⁸-D-X⁹-K, where X⁷ and X⁸ may be an amino acid residue or a covalent bond and X⁹ is a covalent bond or an aspartate residue. In one embodiment, each X⁷, X⁸ and X⁹ is independently an aspartate residue thus resulting in the enterokinase cleavable site DDDDK which is preferably located immediately adjacent to the amino terminus of the target peptide. When n is at least one and X⁵ is a covalent bond, the multiple copies of antigenic domains may be immediately adjacent to the linking sequence; optionally, X⁵ may be a spacer domain interposed between the linking sequence and the antigenic domains. When each X⁴ and X⁵ is independently a spacer domain, it is preferred that the amino acid residue(s) of each X⁴ and X⁵ impart one or more desired properties to the antigenic domain; for example, the amino acids of the spacer domain may be selected to impart a desired folding to the identification polypeptide thereby increasing accessibility to the antibody. In another embodiment, the amino acids of the spacer domain X⁴ and X⁵ may be selected to impart a desired affinity characteristic such as a combination of multiple or alternating histidine residues capable of chelating to an immobilized metal ion on a resin or other matrix. Furthermore, these desired properties may be designed into other areas of the identification polypeptide; for example, the amino acids represented by X² and X³ may be selected to impart a desired peptide folding or a desired affinity characteristic for use in affinity purification.

[0063] In another embodiment, the spacer comprises multiple copies of an antigenic domain. For example, in one embodiment the spacer may comprise a linking sequence containing a single enterokinase or other cleavage site, or generally correspond to the sequence:

X²⁰-(D-Y—K—X²—X³-D)_(n)—X⁵-(D-Y—K—X⁷—X⁸-D-X⁹—K)—X²¹

[0064] where:

[0065] D, Y, K are their representative amino acids;

[0066] X²⁰ and X²¹ are independently a hydrogen or a covalent bond;

[0067] each X², X³, X⁷ and X⁸ is independently an amino acid residue, preferably an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

[0068] X⁵ is a covalent bond or a spacer domain comprising at least one amino acid, if other than a covalent bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)_(m)—, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr and Val;

[0069] X⁹ is a covalent bond or an aspartate residue; and

[0070] n is at least 2.

[0071] In this embodiment, the amino acid sequence X²⁰-(D-Y—K—X²—X³-D)_(n) represents the multiple copies of the antigenic domain D-Y—K—X²—X³-D in tandem which are joined to a linking sequence (D-Y—K—X⁷—X⁸-D-X⁹—K). In this embodiment, one antigenic domain is immediately adjacent to another antigenic domain, i.e., no intervening spacer domains, and the multiple copies of the antigenic domain are immediately adjacent to the linking sequence when X⁵ is a covalent bond. The linking sequence contains a single enterokinase cleavable site which is represented by the sequence —X⁷—X⁸-D-X⁹—K, where X⁷ and X⁸ may be a covalent bond or an amino acid residue, preferably an aspartate residue, and X⁹ is a covalent bond or an aspartate residue. In one embodiment, each X⁷, X⁸ and X⁹ is independently an aspartate residue thus resulting in the enterokinase cleavable site DDDDK which is preferably adjacent to the amino terminus of the target peptide. Optionally, the multiple copies of the antigenic domain are joined to the linking sequence by a spacer X⁵ when X⁵ is at least one amino acid residue. When X⁵ is a spacer domain, it is preferred that the amino acid residue(s) of X⁵ impart one or more desired properties to the recombinant polypeptide, protein or protein fragment; for example, the amino acids of the spacer domain may be selected to impart a desired folding to the recombinant polypeptide, protein or protein fragment thereby increasing accessibility to the antibody. In another embodiment, the amino acids of the spacer domain may be selected to impart a desired affinity characteristic such as a combination of multiple or alternating histidine residues capable of chelating to an immobilized metal ion on a resin or other matrix. Furthermore, these desired properties may be designed into other areas of the spacer; for example, the amino acids represented by X² and X³ may be selected to impart a desired peptide folding or a desired affinity characteristic for use in affinity purification.
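By way of illustration only, the general sequence of this embodiment can be expanded into a one-letter amino acid string. The following is a minimal sketch, not part of the disclosed method: the function name `build_tag` and its parameter names are hypothetical, a covalent bond is modeled as an empty string, and the terminal groups X²⁰ and X²¹ are omitted because they are a hydrogen or a covalent bond.

```python
def build_tag(n, x2="D", x3="D", x5="", x7="D", x8="D", x9="D"):
    """Expand (D-Y-K-X2-X3-D)n-X5-(D-Y-K-X7-X8-D-X9-K) into one-letter code.

    Hypothetical helper for illustration. A covalent bond is represented
    by an empty string; the defaults choose aspartate for every variable
    residue and a covalent bond for the spacer X5.
    """
    if n < 2:
        raise ValueError("this embodiment requires n to be at least 2")
    if x9 not in ("", "D"):
        raise ValueError("X9 must be a covalent bond or an aspartate residue")
    # One copy of the antigenic domain D-Y-K-X2-X3-D.
    antigenic_domain = "D" + "Y" + "K" + x2 + x3 + "D"
    # Linking sequence D-Y-K-X7-X8-D-X9-K carrying the cleavable site.
    linking_sequence = "D" + "Y" + "K" + x7 + x8 + "D" + x9 + "K"
    return antigenic_domain * n + x5 + linking_sequence


if __name__ == "__main__":
    # Two tandem antigenic domains joined directly to the linking sequence.
    print(build_tag(2))
    # Same, with a His-Gly-His spacer as X5.
    print(build_tag(2, x5="HGH"))
```

With the default arguments the linking sequence ends in DDDDK, the enterokinase cleavable site described above; supplying a histidine-containing X⁵ places a metal-chelating spacer between the antigenic domains and the linking sequence.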

[0072] When the affinity polypeptide is located at the amino terminus of the target polypeptide, protein or protein fragment, it is often desirable to design the amino acid sequence such that an initiator methionine is present. Accordingly, in one embodiment of the present invention, the recombinant polypeptide, protein or protein fragment comprises multiple copies of an antigenic domain, a linking sequence containing a single enterokinase cleavage site and generally corresponds to the sequence:

X²⁰—X¹⁰-(D-Y—K—X²—X³-D)_(n)—X⁵-(D-Y—K—X⁷—X⁸-D-X⁹—K)—X²¹

[0073] where:

[0074] D, Y, and K are their representative amino acids;

[0075] X²⁰ and X²¹ are independently a hydrogen or a covalent bond;

[0076] X¹⁰ is a covalent bond or an amino acid, if other than a covalent bond, preferably a methionine residue;

[0077] each X², X³, X⁷ and X⁸ is independently an amino acid residue, preferably an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

[0078] X⁵ is a covalent bond or a spacer domain comprising at least one amino acid, if other than a covalent bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)_(m)—, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, and Val;

[0079] X⁹ is a covalent bond or an aspartate residue; and

[0080] n is at least 2.

[0081] In this embodiment, the amino acid sequence X²⁰-(D-Y—K—X²—X³-D)_(n) represents the multiple copies of the antigenic domain D-Y—K—X²—X³-D in tandem which is flanked by a linking sequence (D-Y—K—X⁷—X⁸-D-X⁹—K) and an initiator amino acid X¹⁰, preferably methionine. The antigenic domain D-Y—K—X²—X³-D with an initiator methionine is recognized by the M5® antibody (Sigma-Aldrich, St. Louis, Mo.). In this embodiment, one antigenic domain is immediately adjacent to another antigenic domain, i.e., no intervening spacer domains, and the multiple copies of the antigenic domain are immediately adjacent to the linking sequence when X⁵ is a covalent bond. The linking sequence contains an enterokinase cleavable site which is represented by the amino acid sequence —X⁷—X⁸-D-X⁹—K, where X⁷ and X⁸ may be a covalent bond or an amino acid residue, preferably an aspartate residue, and X⁹ is a covalent bond or an aspartate residue. In one embodiment, each X⁷, X⁸ and X⁹ is independently an aspartate residue thus resulting in the enterokinase cleavable site DDDDK which is preferably adjacent to the amino terminus of the target peptide. Optionally, the multiple copies of the antigenic domain are joined to the linking sequence by a spacer domain X⁵ when X⁵ is at least one amino acid residue. When X⁵ is a spacer domain, it is preferred that the amino acid residue(s) of X⁵ impart one or more desired properties to the affinity polypeptide; for example, the amino acids of the spacer domain may be selected to impart a desired folding to the recombinant polypeptide, protein or protein fragment thereby increasing accessibility to the antibody. In another embodiment, the amino acids of the spacer domain may be selected to impart a desired affinity characteristic such as a combination of multiple or alternating histidine residues capable of chelating to an immobilized metal ion on a resin or other matrix. Furthermore, these desired properties may be designed into other areas of the affinity polypeptide; for example, the amino acids represented by X² and X³ may be selected to impart a desired peptide folding or a desired affinity characteristic for use in affinity purification.

[0082] In another embodiment of the present invention, the recombinant polypeptide, protein or protein fragment comprises one or more copies of an antigenic sequence, a linking sequence containing a single enterokinase cleavable site and generally corresponds to the sequence:

X²⁰-(D-X¹¹—Y—X¹²—X¹³)_(n)—X¹⁴-(D-X¹¹—Y—X¹²—X¹³-D-X¹⁵—K)—X²¹

[0083] where:

[0084] D, Y and K are their representative amino acids;

[0085] X²⁰ and X²¹ are independently a hydrogen or a covalent bond;

[0086] each X¹¹ is a covalent bond or an amino acid, preferably Leu;

[0087] each X¹² is an amino acid, preferably selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

[0088] each X¹³ is a covalent bond or at least one amino acid, if other than a covalent bond, preferably selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

[0089] X¹⁴ is a covalent bond or a spacer domain comprising at least one amino acid, if other than a covalent bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X)_(m)—, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr and Val;

[0090] X¹⁵ is a covalent bond or an aspartate residue; and

[0091] n is at least 0 or at least 1.

[0092] In this embodiment, when n is at least 2, the amino acid sequence X²⁰-(D-X¹¹—Y—X¹²—X¹³)_(n) constitutes multiple copies of the antigenic domain D-X¹¹—Y—X¹²—X¹³ in tandem which are joined to a linking sequence (D-X¹¹—Y—X¹²—X¹³-D-X¹⁵—K). Additionally, one antigenic domain may be immediately adjacent to another antigenic domain, i.e., no intervening spacer domains, and the multiple copies of the antigenic domain may be immediately adjacent to the linking sequence when X¹⁴ is a covalent bond. The linking sequence contains a single enterokinase cleavable site which is represented by the sequence —X¹²—X¹³-D-X¹⁵—K where X¹² and X¹³ may be a covalent bond or an amino acid residue, preferably an aspartate residue, and X¹⁵ is a covalent bond or an aspartate residue. In one embodiment, each X¹², X¹³ and X¹⁵ is independently an aspartate residue thus resulting in the enterokinase cleavable site DDDDK which is preferably adjacent to the amino terminus of the target peptide. Optionally, when n is at least two, the multiple copies of the antigenic domain are joined to the linking sequence by a spacer X¹⁴ when X¹⁴ is at least one amino acid residue. When X¹⁴ is a spacer domain, it is preferred that the amino acid residue(s) of X¹⁴ impart one or more desired properties to the recombinant polypeptide, protein or protein fragment; for example, the amino acids of the spacer domain may be selected to impart a desired folding to the recombinant polypeptide, protein or protein fragment thereby increasing accessibility to the antibody. In another embodiment, the amino acids of the spacer domain X¹⁴ may be selected to impart a desired affinity characteristic such as a combination of multiple or alternating histidine residues capable of chelating to an immobilized metal ion on a resin or other matrix.

[0093] In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises the enzyme glutathione-S-transferase of the parasite helminth Schistosoma japonicum (SEQ ID NO: 1). The glutathione-S-transferase may, however, be derived from other species including human and other mammalian glutathione-S-transferase. Proteins expressed as fusions with the enzyme glutathione-S-transferase can be purified under non-denaturing conditions by affinity chromatography on immobilized glutathione. Glutathione-agarose beads have a capacity of at least 8 mg fusion protein/ml swollen beads and can be used several times for different preparations of the same fusion protein. Smith, D. B. and Johnson, K. S., Gene, 67:31-40, 1988.

[0094] In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises a cellulose binding domain (CBD) (SEQ ID NO: 2). CBDs are found in both bacterial and fungal sources and possess a high affinity for the crystalline form of cellulose. This property has been useful for purification of fusion proteins using a cellulose matrix. Fusion proteins have been attached at both the N- and C-terminus of CBD.

[0095] In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises the Maltose Binding Protein (MBP) encoded by the malE gene in E. coli (SEQ ID NO: 3). MBP has found utility in the formation of chimeric proteins with eukaryotic proteins for expression in bacterial systems. This system permits expression of soluble fusion proteins that can readily be purified on immobilized amylose resin.

[0096] In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises Protein A (SEQ ID NO: 4). Protein A is isolated from Staphylococcus aureus and binds to the Fc region of IgG. Fusion proteins containing the IgG binding domains of Protein A can be affinity purified on IgG resins (e.g., IgG Sepharose 6FF (Pharmacia Biotech)). The signal sequence of Protein A is functional in E. coli. Fusion proteins using Protein A have shown increased stability when expressed both in the cytoplasm and periplasm in E. coli.

[0097] In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises Protein G (SEQ ID NO: 5). Protein G is similar to Protein A, with the difference being that Protein G binds to human serum albumin in addition to IgG. The major disadvantage is that a low pH (<3.4) is required to elute the fusion protein.

[0098] In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises IgG (SEQ ID NO: 6). Placing the protein of interest on the C-terminal of IgG generates chimeric proteins. This allows purification of the fusion protein using either a Protein A or a Protein G matrix.

[0099] In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises the enzyme chloramphenicol acetyl transferase (CAT) from E. coli (SEQ ID NO: 7). CAT is used in the form of a C-terminal fusion. CAT is readily translated in E. coli and allows for over-expression of heterologous proteins. Capture of fusion proteins is accomplished using a chloramphenicol matrix.

[0100] In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises streptavidin (SEQ ID NO: 8). Streptavidin is used for fusion proteins because of its high affinity and high specificity for biotin. Streptavidin is a neutral protein, free from carbohydrates and sulphydryl groups.

[0101] In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises β-galactosidase (SEQ ID NO: 9). β-galactosidase is an enzyme that is utilized as both an N- and C-terminal fusion protein. Fusion proteins containing β-galactosidase sequences can be affinity purified on aminophenyl-β-D-thiogalactosidyl-succinyldiaminohexyl-Sepharose. However, given that C-terminal fusion proteins are usually insoluble, the system has limited use in bacterial systems. N-terminal fusions are soluble in E. coli, but due to the large size of β-galactosidase, this system is used more often in eukaryotic gene expression.

[0102] In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises the Green Fluorescent Protein (GFP) (SEQ ID NO: 10). GFP is a protein from the jellyfish Aequorea victoria, and many mutant variations of this protein have been used successfully in most organisms for protein expression. The major use of these types of fusion proteins is for targeting and determining physiological function of the host cell protein.

[0103] In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises thioredoxin (SEQ ID NO: 11). Thioredoxin is a relatively small thermostable protein that is easily over-expressed in bacterial systems. Thioredoxin fusion systems are useful in avoiding the formation of inclusion bodies during heterologous gene expression. This has been particularly useful in the expression of mammalian cytokines.

[0104] In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises Calmodulin Binding Protein (CBP) (SEQ ID NO: 12). This tag is derived from the C-terminus of skeletal muscle myosin light chain kinase. This small tag is recognized by calmodulin and forms the base of the technology. The tag is translated efficiently and allows for the expression and recovery of N-terminal chimeric genes.

[0105] In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises the c-myc epitope sequence Glu-Gln-Lys-Leu-Ile-Ser-Glu-Glu-Asp-Leu (SEQ ID NO: 13). This C-terminal portion of the myc oncogene, which is part of the p53 signaling pathway, has been used as a detection tag for expression of recombinant proteins in mammalian cells.

[0106] In another embodiment of this invention, a spacer (Sp₁ or Sp₂) comprises the HA epitope sequence Tyr-Pro-Tyr-Asp-Val-Tyr-Ala (SEQ ID NO: 14). This detection tag has been utilized for the expression of recombinant proteins in mammalian cells.

[0107] In another embodiment of this invention, the spacer (Sp₁ or Sp₂) comprises a polypeptide possessing an amino acid sequence having at least 70% homology to any one of the amino acid sequences disclosed in SEQ ID NOS: 1-14, and retains the same binding characteristics as said amino acid sequence.
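The specification does not state how the 70% figure is to be computed. Purely as a sketch, and under the loudly stated assumption that percent identity over a pre-computed pairwise alignment is used as a stand-in for "homology", the threshold could be checked as follows. The function name `percent_identity` and the gap character are hypothetical choices, and producing the alignment itself is outside the scope of this example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned one-letter sequences.

    Assumption: identity over aligned columns is used as a proxy for
    homology. Columns where both sequences have a gap are ignored; a gap
    opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    reference = "EQKLISEEDL"   # c-myc epitope (SEQ ID NO: 13), one-letter code
    candidate = "EQKLISAADL"   # hypothetical variant with two substitutions
    score = percent_identity(reference, candidate)
    print(score, score >= 70.0)
```

A candidate passing such a numerical check would still need to retain the binding characteristics of the reference sequence, which this sketch does not assess.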

[0108] DNA sequences encoding the aforementioned proteins which may be employed as spacers (Sp₁ or Sp₂) are commercially available (e.g., malE gene sequences encoding the MBP are available from New England Biolabs (pMAL-c2 and pMAL-p2); Schistosoma japonicum glutathione-S-transferase (GST) gene sequences are available from Pharmacia Biotech (the pGEX series which have GenBank Accession Nos.: U13849 to U13858); β-galactosidase (the lacZ gene product) gene sequences are available from Pharmacia Biotech (pCH110 and pMC1871; GenBank Accession Nos: U13845 and L08936, respectively); sequences encoding the IgG binding domains of Protein A are available from Pharmacia Biotech (pRIT2T; GenBank Accession No. U13864)).

[0109] When any of the above listed proteins (including the hinge/Fc domains of human IgG₁) are used as spacers, it is not required that the entire protein be used as a spacer. Portions of these proteins may be used as the spacer provided the portion selected is sufficient to permit interaction of a fusion protein containing the portion of the protein used as the spacer with the desired affinity resin.

[0110] Expression and Purification

[0111] The polypeptides, proteins and protein fragments of the present invention are generally prepared and expressed as a fusion protein using conventional recombinant DNA technology. The fusion protein is thus produced by host cells transformed with the genetic information encoding the fusion protein. The host cells may secrete the fusion protein into the culture media or store it in the cells, in which case the cells must be collected and disrupted in order to extract the product. As hosts, E. coli, yeast, insect cells, mammalian cells and plants are suitable. Of these, E. coli will typically be the preferred host for most applications. In one embodiment, the recombinant polypeptides, proteins and protein fragments are produced in a soluble form or secreted from the host.

[0112] In general, a chimeric gene is inserted into an expression vector which allows for the expression of the desired fusion protein in a suitable transformed host. The expression vector provides the inserted chimeric gene with the necessary regulatory sequences to control expression in the suitable transformed host.

[0113] There are six elements of the expression control sequence for proteins which are to be secreted from a host into the medium, while five of these elements apply to fusion proteins expressed intracellularly. These elements, in the order they appear in the gene, are: a) the promoter region; b) the 5′ untranslated region; c) the signal sequence; d) the chimeric coding sequence; e) the 3′ untranslated region; f) the transcription termination site. Fusion proteins which are not secreted do not contain c), the signal sequence.
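The ordering of elements just described can be captured in a small data structure. The following is an illustrative sketch only; the names `CONTROL_ELEMENTS` and `control_elements` are hypothetical and do not correspond to any vector-design tool.

```python
# Elements of the expression control sequence, in the order in which they
# appear in the gene, as enumerated a) through f) in the text.
CONTROL_ELEMENTS = [
    "promoter region",
    "5' untranslated region",
    "signal sequence",
    "chimeric coding sequence",
    "3' untranslated region",
    "transcription termination site",
]


def control_elements(secreted):
    """Return the ordered elements for a secreted or intracellular fusion.

    Secreted proteins use all six elements; intracellularly expressed
    fusion proteins omit the signal sequence, leaving five.
    """
    if secreted:
        return list(CONTROL_ELEMENTS)
    return [e for e in CONTROL_ELEMENTS if e != "signal sequence"]


if __name__ == "__main__":
    for secreted in (True, False):
        layout = control_elements(secreted)
        print(len(layout), " > ".join(layout))
```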

[0114] The recombinant expression vectors of the invention comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell. This means that the recombinant expression vectors include one or more regulatory sequences, selected on the basis of the host cells to be used for expression, operably linked to the nucleic acid sequence to be expressed. It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein.

[0115] Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques. For stable transfection of mammalian cells, it is known that, depending upon the expression vector and transfection technique used, only a small fraction of cells may integrate the foreign DNA into their genome. In order to identify and select these integrants, a gene that encodes a selectable marker (e.g., for resistance to antibiotics) is generally introduced into the host cells along with the gene of interest. Preferred selectable markers include those which confer resistance to drugs, such as G418, hygromycin, and methotrexate. Nucleic acid encoding a selectable marker can be introduced into a host cell on the same vector as that encoding the metal ion-affinity peptide containing fusion protein or can be introduced on a separate vector. Cells stably transfected with the introduced nucleic acid can be identified by drug selection (e.g., cells that have incorporated the selectable marker gene will survive, while the other cells die). Methods and materials for preparing recombinant vectors, transforming host cells using replicating vectors, and expressing biologically active foreign polypeptides and proteins are generally well known.

[0116] The expressed recombinant polypeptides, proteins and protein fragments may be separated from other material present in the secretion media or extraction solution, or from other liquid mixtures, through immobilized metal affinity chromatography (“IMAC”). For example, the culture media containing the secreted recombinant polypeptides, proteins and protein fragments or the cell extracts containing the recombinant polypeptides, proteins and protein fragments may be passed through a column that contains a resin comprising an immobilized metal ion. In IMAC, metal ions are immobilized onto a solid support, and used to capture proteins comprising a metal chelating peptide. The metal chelating peptide may occur naturally in the protein, or the protein may be a recombinant protein with an affinity tag comprising a metal chelating peptide. Exemplary metal ions include aluminum, cadmium, calcium, cobalt, copper, gallium, iron, nickel, ytterbium and zinc. In one embodiment, the metal ion is preferably nickel, copper, cobalt, or zinc. In another embodiment, the metal ion is nickel. Advantageously, the components of the solution other than the recombinant polypeptide, protein or protein fragment freely pass through the column. The immobilized metal, however, chelates or binds the recombinant polypeptides, proteins and protein fragments, thereby separating them from the remaining contents of the liquid mixture in which they were originally contained.

[0117] Resins useful for producing immobilized metal ion affinity chromatography (IMAC) columns are available commercially. Examples of resins derivatized with iminodiacetic acid (IDA) are Chelating Sepharose 6B (Pharmacia), Immobilized Iminodiacetic Acid (Pierce), and Iminodiacetic Acid Agarose (Sigma-Aldrich). In addition, Porath has immobilized tris(carboxymethyl)ethylenediamine (TED) on Sepharose 6B and used it to fractionate serum proteins. Porath, J. and Olin, B., Biochemistry, 22:1621-1630, 1983. Other reports suggest that Trisacryl GF2000 and silica can be derivatized with IDA, TED, or aspartic acid, and the resulting materials used in producing IMAC substances.

[0118] In one embodiment, the capture ligand is a metal chelate as described in WO 01/81365. More specifically, in this embodiment the capture ligand is a metal chelate derived from metal chelating composition (1):

[0119] wherein

[0120] Q is a carrier;

[0121] S¹ is a spacer;

[0122] L is -A-T-CH(X)— or —C(═O)—;

[0123] A is an ether, thioether, selenoether, or amide linkage;

[0124] T is a bond or substituted or unsubstituted alkyl or alkenyl;

[0125] X is —(CH₂)_(k)CH₃, —(CH₂)_(k)COOH, —(CH₂)_(k)SO₃H, —(CH₂)_(k)PO₃H₂, —(CH₂)_(k)N(J)₂, or —(CH₂)_(k)P(J)₂, preferably —(CH₂)_(k)COOH or —(CH₂)_(k)SO₃H;

[0126] k is an integer from 0 to 2;

[0127] J is hydrocarbyl or substituted hydrocarbyl;

[0128] Y is —COOH, —H, —SO₃H, —PO₃H₂, —N(J)₂, or —P(J)₂, preferably —COOH;

[0129] Z is —COOH, —H, —SO₃H, —PO₃H₂, —N(J)₂, or —P(J)₂, preferably —COOH; and

[0130] i is an integer from 0 to 4, preferably 1 or 2.

[0131] In general, the carrier, Q, may comprise any solid or soluble material or compound capable of being derivatized for coupling. Solid (or insoluble) carriers may be selected from a group including agarose, cellulose, methacrylate copolymers, polystyrene, polypropylene, paper, polyamide, polyacrylonitrile, polyvinylidene, polysulfone, nitrocellulose, polyester, polyethylene, silica, glass, latex, plastic, gold, iron oxide and polyacrylamide, but may be any insoluble or solid compound able to be derivatized to allow coupling of the remainder of the composition to the carrier, Q. Soluble carriers include proteins, nucleic acids including DNA, RNA, and oligonucleotides, lipids, liposomes, synthetic soluble polymers, polyamino acids, albumin, antibodies, enzymes, streptavidin, peptides, hormones, chromogenic dyes, fluorescent dyes, fluorochromes or any other detection molecule, drugs, small organic compounds, polysaccharides and any other soluble compound able to be derivatized for coupling the remainder of the composition to the carrier, Q. In one embodiment, the carrier, Q, is the container of the present invention. In another embodiment, the carrier, Q, is a body provided within the container of the present invention.

[0132] The spacer, S¹, which flanks the carrier comprises a chain of atoms which may be saturated or unsaturated, substituted or unsubstituted, linear or cyclic, or straight or branched. Typically, the chain of atoms defining the spacer, S¹, will consist of no more than about 25 atoms; stated another way, the backbone of the spacer will consist of no more than about 25 atoms. More preferably, the chain of atoms defining the spacer, S¹, will consist of no more than about 15 atoms, and still more preferably no more than about 12 atoms. The chain of atoms defining the spacer, S¹, will typically be selected from the group consisting of carbon, oxygen, nitrogen, sulfur, selenium, silicon and phosphorus and preferably from the group consisting of carbon, oxygen, nitrogen, sulfur and selenium. In addition, the chain atoms may be substituted or unsubstituted with atoms other than hydrogen such as hydroxy, keto (═O), or acyl such as acetyl. Thus, the chain may optionally include one or more ether, thioether, selenoether, amide, or amine linkages between hydrocarbyl or substituted hydrocarbyl regions. Exemplary spacers, S¹, include methylene, alkyleneoxy (—(CH₂)_(a)O—), alkylenethioether (—(CH₂)_(a)S—), alkyleneselenoether (—(CH₂)_(a)Se—), alkyleneamide (—(CH₂)_(a)NR¹(C═O)—), alkylenecarbonyl (—(CH₂)_(a)CO—), and combinations thereof wherein a is generally from 1 to about 20 and R¹ is hydrogen or hydrocarbyl, preferably alkyl. In one embodiment, the spacer, S¹, is a hydrophilic, neutral structure and does not contain any amine linkages or substituents or other linkages or substituents which could become electrically charged during the purification of a polypeptide.

[0133] As noted above, the linker, L, may be -A-T-CH(X)— or —C(═O)—. When L is -A-T-CH(X)—, the chelating composition corresponds to the formula:

[0134] wherein Q, S¹, A, T, X, Y, and Z are as previously defined. In this embodiment, the ether (—O—), thioether (—S—), selenoether (—Se—) or amide (—NR¹(C═O)— or —(C═O)NR¹—, wherein R¹ is hydrogen or hydrocarbyl) linkage is separated from the chelating portion of the molecule by a substituted or unsubstituted alkyl or alkenyl region. If other than a bond, T is preferably substituted or unsubstituted C₁ to C₆ alkyl or substituted or unsubstituted C₂ to C₆ alkenyl. More preferably, A is —S—, T is —(CH₂)_(n)—, and n is an integer from 0 to 6, typically 0 to 4, and more typically 0, 1 or 2.

[0135] When L is —C(═O)—, the chelating composition corresponds to the formula:

[0136] wherein Q, S¹, i, Y, and Z are as previously defined.

[0137] In one embodiment, the sequence —S¹-L-, in combination, is a chain of no more than about 35 atoms selected from the group consisting of carbon, oxygen, sulfur, selenium, nitrogen, silicon and phosphorus, more preferably only carbon, oxygen, sulfur and nitrogen, and still more preferably only carbon, oxygen and sulfur. To reduce the prospects for non-specific binding, nitrogen, when present, is preferably in the form of an amide moiety. In addition, if the carbon chain atoms are substituted with anything other than hydrogen, they are preferably substituted with hydroxy or keto. In one embodiment, L comprises a portion (sometimes referred to as a fragment or residue) derived from an amino acid such as cystine, homocystine, cysteine, homocysteine, aspartic acid, cysteic acid or an ester thereof such as the methyl or ethyl ester thereof.

[0138] Exemplary chelating compositions corresponding to formula 1 include the following:

[0139] wherein Q is a carrier and Ac is acetyl.

[0140] In another embodiment, the capture ligand is a metal chelate of the type described in U.S. Pat. No. 5,047,513. More specifically, in this embodiment the capture ligand is a metal chelate derived from nitrilotriacetic acid derivatives of the formula

[0141] wherein S² is —O—CH₂—CH(OH)—CH₂— or —O—CO— and x is 2, 3 or 4. In this embodiment, the nitrilotriacetic acid derivative is immobilized on any of the previously described carriers, Q.

[0142] In these embodiments in which the capture ligand is a metal chelate as described in WO 01/81365 or U.S. Pat. No. 5,047,513, the metal chelate may contain any of the metal ions previously described in connection with IMAC. In one embodiment, the metal chelate comprises a metal ion selected from among nickel (Ni²⁺), zinc (Zn²⁺), copper (Cu²⁺), iron (Fe³⁺), cobalt (Co²⁺), calcium (Ca²⁺), aluminum (Al³⁺), magnesium (Mg²⁺), and manganese (Mn²⁺). In another embodiment, the metal chelate comprises nickel (Ni²⁺).

[0143] Another common purification technique that can be used in the context of the present invention is the use of an immunogenic capture system where the recombinant polypeptide, protein or protein fragment comprises an antigenic domain in a spacer region (Sp₁ or Sp₂). Any of the previously described antigenic systems comprising the spacer may be used for this purpose. In such systems, an epitope tag on a protein or peptide allows the protein to which it is attached to be purified based upon the affinity of the epitope tag for a corresponding ligand (e.g., antibody) immobilized on a support. One example of such a tag is the sequence Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys, or DYKDDDDK (SEQ ID NO: 15); antibodies having specificity for this sequence are sold by Sigma-Aldrich (St. Louis, Mo.) under the FLAG® trademark. Another example of such a tag is the sequence Asp-Leu-Tyr-Asp-Asp-Asp-Asp-Lys, or DLYDDDDK (SEQ ID NO: 16); antibodies having specificity for this sequence are sold by Invitrogen (Carlsbad, Calif.). Another example of such a tag is the 3×FLAG® sequence Met-Asp-Tyr-Lys-Asp-His-Asp-Gly-Asp-Tyr-Lys-Asp-His-Asp-Ile-Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 17); antibodies having specificity for this sequence are sold by Sigma-Aldrich (St. Louis, Mo.). Thus, in one embodiment, the carrier comprises immobilized antibodies which have specificity for the DYKDDDDK epitope; in another embodiment, the carrier comprises immobilized antibodies which have specificity for the DLYDDDDK epitope. In another embodiment, the carrier comprises immobilized antibodies which have specificity for SEQ ID NO: 17. For example, in one embodiment, the ANTI-FLAG® M1, M2, or M5 antibody is immobilized on the interior surface of a column, or a portion thereof, and/or a bead or other support within a column.
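The epitope tags above are given in both three-letter and one-letter notation. As a convenience, and only as a sketch, the conversion between the two can be written as follows; the names `THREE_TO_ONE` and `to_one_letter` are hypothetical, and the mapping is the standard one-letter code for the twenty common amino acids.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_sequence):
    """Convert a hyphen-separated three-letter sequence to one-letter code."""
    residues = three_letter_sequence.split("-")
    # A KeyError here signals a residue name that is not a standard code.
    return "".join(THREE_TO_ONE[r] for r in residues)


if __name__ == "__main__":
    flag = "Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys"
    three_x_flag = (
        "Met-Asp-Tyr-Lys-Asp-His-Asp-Gly-"
        "Asp-Tyr-Lys-Asp-His-Asp-Ile-"
        "Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys"
    )
    print(to_one_letter(flag))
    print(to_one_letter(three_x_flag))
```

Converting the 23-residue sequence of SEQ ID NO: 17 in this way shows that it terminates in the DYKDDDDK epitope of SEQ ID NO: 15.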

[0144] After the recombinant polypeptides, proteins and proteinfragments are separated from other components of the liquid mixture, theconditions in the column may be changed to release the bound material.For example, the bound molecules may be eluted by pH change, imidazole,or competition with another linker peptide from the column.

[0145] Alternatively, the target polypeptide, protein or protein fragment portion of the bound recombinant polypeptide, protein or protein fragment may be selectively released from the immobilized metal. For example, if there is a cleavage site between the target polypeptide, protein or protein fragment and the metal ion-affinity peptide, and if the bound recombinant polypeptide, protein or protein fragment is treated with the appropriate enzyme, the target polypeptide, protein or protein fragment may be selectively released while the metal ion-affinity polypeptide fragment remains bound to the immobilized metal. For this purpose, the cleavage site is preferably an enzymatically cleavable linker peptide having the ability to undergo site-specific proteolysis. Suitable cleaving enzymes in accordance with this invention are activated factor X (factor Xa), DPP I, DPP II, DPP IV, carboxypeptidase A, collagenase, enterokinase, human renin, thrombin, trypsin, subtilisin and V5.

[0146] It is to be appreciated that some polypeptide or protein molecules will possess the desired enzymatic or biological activity with the metal chelate peptide still attached either at the C-terminal end or at the N-terminal end or both. In those cases the purification of the chimeric protein will be accomplished without subjecting the protein to site-specific proteolysis.

[0147] The present invention may be used to purify any prokaryotic or eukaryotic protein that can be expressed as the product of recombinant DNA technology in a transformed host cell. These recombinant protein products include hormones, receptors, enzymes, storage proteins, blood proteins, mutant proteins produced by protein engineering techniques, or synthetic proteins. The purification process of the present invention can be used batchwise or in continuously run columns.

[0148] It is to be understood that the present invention has been described in detail by way of illustration and example in order to acquaint others skilled in the art with the invention, its principles, and its practical application. Further, the specific embodiments of the present invention as set forth are not intended to be exhaustive or to limit the invention, and many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the foregoing examples and detailed description. Accordingly, this invention is intended to embrace all such alternatives, modifications, and variations that fall within the spirit and scope of the following claims. While some of the examples and descriptions above include some conclusions about the way the invention may function, the inventors do not intend to be bound by those conclusions and functions, but put them forth only as possible explanations in light of current understanding.

[0149] Abbreviations and Definitions

[0150] To facilitate understanding of the invention, a number of terms are defined below. Definitions of certain terms are included here. Any term not defined is understood to have the normal meaning used by scientists contemporaneous with the submission of this application.

[0151] The term “expression vector” as used herein refers to nucleic acid sequences containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism. Nucleic acid sequences necessary for expression in prokaryotes include a promoter, a ribosome binding site, an initiation codon, a stop codon, optionally an operator sequence and possibly other regulatory sequences. Eukaryotic cells utilize promoters, a Kozak sequence and often enhancers and polyadenylation signals. Prokaryotic cells also utilize a Shine-Dalgarno ribosome binding site. The present invention includes vectors or plasmids which can be used as vehicles to transform any viable host cell with the recombinant DNA expression vector.

[0152] “Operably linked” is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell).

[0153] The term “regulatory sequence” is intended to include promoters, enhancers, and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those that direct constitutive expression of a nucleotide sequence in many types of host cell and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences).

[0154] The terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation. Suitable methods for transforming or transfecting host cells can be found in laboratory manuals.

[0155] The term “hydrophilic” when used in reference to amino acids refers to those amino acids which have polar and/or charged side chains. Hydrophilic amino acids include lysine, arginine, histidine, aspartate (i.e., aspartic acid), glutamate (i.e., glutamic acid), serine, threonine, cysteine, tyrosine, asparagine and glutamine.

[0156] The term “hydrophobic” when used in reference to amino acids refers to those amino acids which have nonpolar side chains. Hydrophobic amino acids include valine, leucine, isoleucine, cysteine and methionine. Three hydrophobic amino acids have aromatic side chains. Accordingly, the term “aromatic” when used in reference to amino acids refers to the three aromatic hydrophobic amino acids phenylalanine, tyrosine and tryptophan.
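The hydrophilic, hydrophobic and aromatic groupings defined in the two preceding paragraphs can be captured as simple sets. The sketch below follows the lists exactly as written, so cysteine and tyrosine each fall into two groups; the function name `classes_of` is hypothetical and used for illustration only.

```python
# Residue groupings copied from the definitions of "hydrophilic",
# "hydrophobic" and "aromatic" given above (three-letter codes).
HYDROPHILIC = {"Lys", "Arg", "His", "Asp", "Glu", "Ser",
               "Thr", "Cys", "Tyr", "Asn", "Gln"}
HYDROPHOBIC = {"Val", "Leu", "Ile", "Cys", "Met"}
AROMATIC = {"Phe", "Tyr", "Trp"}

GROUPS = {
    "hydrophilic": HYDROPHILIC,
    "hydrophobic": HYDROPHOBIC,
    "aromatic": AROMATIC,
}


def classes_of(residue: str) -> list[str]:
    """Return, in alphabetical order, every group that lists the residue."""
    return sorted(name for name, members in GROUPS.items() if residue in members)


for res in ("His", "Cys", "Tyr", "Gly"):
    print(res, classes_of(res))
```

A residue that is not named in any of the lists, such as glycine, returns an empty list.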

[0157] The term “fusion protein” refers to polypeptides and proteins which consist of a metal ion-affinity linker peptide and a protein or polypeptide operably linked directly or indirectly to the metal ion-affinity peptide. The metal ion-affinity linker peptide may be located at the amino-terminal portion of the fusion protein or at the carboxy-terminal portion, thus forming an “amino-terminal fusion protein” or a “carboxy-terminal fusion protein,” respectively.

[0158] The terms “metal ion-affinity peptide”, “metal binding peptide” and “linker peptide” are used interchangeably to refer to an amino acid sequence which displays an affinity to metal ions. The minimum length of the immobilized metal ion-affinity peptide according to the present invention is seven amino acids including four alternating histidines. The most preferred length is seven amino acids including four alternating histidines.
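The seven-residue pattern with four alternating histidines (His-X-His-X-His-X-His) can be tested with a short regular expression. This is an illustrative sketch only: the function name and the example peptides are hypothetical, and the pattern places no restriction on X.

```python
import re

# Seven residues with histidine at positions 1, 3, 5 and 7.
ALTERNATING_HIS = re.compile(r"H.H.H.H")


def has_alternating_his(peptide: str) -> bool:
    """True if the one-letter peptide contains an H-X-H-X-H-X-H stretch."""
    return ALTERNATING_HIS.search(peptide) is not None


# Hypothetical example peptides, not taken from the sequence listing.
print(has_alternating_his("GSHAHAHAHGS"))  # four alternating His
print(has_alternating_his("GSHAHAHAGS"))   # only three alternating His
```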

[0159] The term “enzyme” referred to herein in the context of a cleavage enzyme means a polypeptide or protein which recognizes a specific amino acid sequence in a polypeptide and cleaves the polypeptide at the scissile bond. In one embodiment of the present invention, enterokinase is the enzyme which is used to free the fusion protein from the immobilized metal ion column. In further embodiments, carboxypeptidase A, DPP I, DPP II, DPP IV, factor Xa, human renin, TEV, thrombin or VIII protease is the enzyme.

[0160] The term “cleavage site” used herein refers to an amino acid sequence which is recognized and cleaved by an enzyme or chemical means at the scissile bond.

[0161] The term “scissile bond” referred to herein is the juncture where cleavage occurs; for example the scissile bond recognized by enterokinase may be the bond following the sequence (Asp₄)-Lys in the spacer peptide or affinity peptide.
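To make the notion of a scissile bond concrete, the following simplified sketch splits a one-letter sequence after every (Asp₄)-Lys stretch, mimicking the enterokinase example above. It is a string-level illustration only and ignores everything else that affects real proteolysis; the function name and the example sequence are hypothetical.

```python
def cleave_after_d4k(seq: str) -> list[str]:
    """Split a one-letter sequence after each DDDDK (Asp4-Lys) occurrence."""
    site = "DDDDK"
    fragments = []
    start = 0
    idx = seq.find(site, start)
    while idx != -1:
        cut = idx + len(site)          # scissile bond follows the Lys
        fragments.append(seq[start:cut])
        start = cut
        idx = seq.find(site, start)
    if start < len(seq):               # keep whatever follows the last cut
        fragments.append(seq[start:])
    return fragments


# Hypothetical fusion: affinity peptide + cleavage site + target residues.
print(cleave_after_d4k("HAHAHAHDDDDKMTARGET"))
```

In this toy example the first fragment carries the histidine-containing peptide and the cleavage site, while the second fragment is the released target portion.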

[0162] By the term “immobilized metal ion-affinity peptide” as used herein is meant an amino acid sequence that chelates immobilized divalent metal ions of metals selected from the group consisting of aluminum, cadmium, calcium, cobalt, copper, gallium, iron, nickel, ytterbium and zinc.

[0163] The term “capture ligand” means any ligand or receptor that can be immobilized or supported on a container or support and used to isolate a cellular component from cellular debris. Some non-limiting examples of capture ligands that may be used in connection with the present invention include: biotin, streptavidin, various metal chelate ions, antibodies, various charged particles such as those for use in ion exchange chromatography, various affinity chromatography supports, and various hydrophobic groups for use in hydrophobic chromatography.

[0164] For all the nucleotide and amino acid sequences disclosed herein, it is understood that equivalent nucleotides and amino acids can be substituted into the sequences without affecting the function of the sequences. Such substitutions are within the ability of a person of ordinary skill in the art.

[0165] The procedures disclosed herein which involve the molecular manipulation of nucleic acids are known to those skilled in the art.

EXAMPLE 1 Construction and Screening of a Metal Ion-Affinity Peptide Library

[0166] A pseudo-random glutathione-S-transferase C-terminal peptide library was constructed with the amino acid sequence of His-X-His-X-His-X-His where X is any amino acid except Gln, His and Pro. The library vector was constructed from the bacterial expression vector pGEX-2T. The library was constructed by annealing a pair of complementary oligonucleotides together. Oligonucleotides were constructed as follows: 5′GATCCCATDNDCATDNDCATDNDCATTMC3′ (SEQ ID NO: 18) and 5′MTTGTTAATGHNHATGHNHATGHNHATGG3′ (SEQ ID NO: 19) where D is nucleotides A, G, or T, H is nucleotides A, C, or T and N is nucleotides A, C, T, or G. The 5′ end was phosphorylated with T4 polynucleotide kinase and the oligonucleotides were annealed together to generate a cassette. The cassette was ligated into pGEX-2T, which had been digested with EcoRI and BamHI restriction endonucleases. Ligated vector was transformed into E. coli DH5-α using standard protocols. Transformants were plated on LB/ampicillin plates (100 mg/L) and incubated overnight at 37° C.
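The degenerate codon DND used at each X position can be expanded to show which residues it encodes. The sketch below assumes the standard genetic code and uses the D, H and N definitions given in the paragraph above; the function names are hypothetical and used for illustration only.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Degenerate symbols as defined in the text.
DEGENERATE = {"D": "AGT", "H": "ACT", "N": "ACGT"}


def expand(codon: str) -> list[str]:
    """List every concrete codon matched by a degenerate codon."""
    choices = [DEGENERATE.get(base, base) for base in codon]
    return ["".join(p) for p in product(*choices)]


def encoded(codon: str) -> set[str]:
    """Set of one-letter residues ('*' = stop) encoded by a degenerate codon."""
    return {CODON_TABLE[c] for c in expand(codon)}


dnd = encoded("DND")
print(len(expand("DND")), "codons ->", "".join(sorted(dnd)))
```

Because every Gln, His and Pro codon begins with C, and D excludes C, none of the 36 DND codons encodes those three residues, while the other 17 amino acids remain accessible. Under the same assumption the expansion also contains the stop codons TAA, TAG and TGA.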

[0167] 900 colonies were picked and placed on 9 master plates. Each master plate contained 100 colonies and was grown overnight at 37° C. A piece of nitrocellulose was placed onto each of the master plates. This piece of nitrocellulose was then removed and the transferred colonies were placed onto a LB/ampicillin plate containing 1 mM isopropyl β-D-thiogalactopyranoside (IPTG) to induce the expression of the GST fusion peptides. The cells were allowed to grow for an additional 4 hours at 37° C. The nitrocellulose filter was removed from the plate and placed sequentially on blotting paper containing the following solutions to lyse the cells in situ:

[0168] (a) 10% SDS for 10 minutes,

[0169] (b) 1.5 M sodium chloride, 0.5 M sodium hydroxide for 5 minutes

[0170] (c) 1.5 M sodium chloride, 0.5 M Tris-HCl pH 7.4 for 5 minutes

[0171] (d) 1.5 M sodium chloride, 0.5 M Tris-HCl pH 7.4 for 5 minutes

[0172] (e) 2×SSC for 15 minutes.

[0173] The filters were dried at ambient temperature followed by an incubation in Tris-buffered saline (TBS) containing 3% non-fat dry milk for 1 hour at room temperature. Filters were then washed 3× for 5 minutes with TBS containing 0.05% Tween-20 (TBS-T). To detect clones that were capable of binding to a metal ion, the filters were incubated with nickel NTA horseradish peroxidase (HRP) at a concentration of 1 mg/ml in TBS-T for 1 hour. The filters were then washed with TBS-T 3× for 5 minutes and incubated with 3,3′,5,5′-tetramethylbenzidine (TMB) to detect the horseradish peroxidase. The reaction was stopped by placing the filters in water. 250 colonies, which were detected above, were picked from the master plate and placed into 1 ml of LB/ampicillin and grown overnight in a 96 deep well plate at 37° C. at 250 rpm on an orbital shaker. 10 μl of the overnight cultures were transferred to a fresh aliquot of LB/ampicillin (1 ml) in a 96 deep well plate and grown for an additional 3 hours. The culture was then induced by adding IPTG (final concentration of 1 mM) and the culture was allowed to grow for an additional 3 hours prior to harvesting by centrifugation. The media was decanted and the cells were frozen overnight at −20° C. in the collection plate. Cells were lysed with 0.6 ml of CelLytic-B (Sigma-Aldrich product no. B3553) and incubated for 15 minutes at room temperature. The cell debris was removed by centrifugation at 3,000×g for 15 minutes. Two experiments were done in parallel, one on a HIS-Select High Sensitivity (HS) nickel coated plate, and the second on a HIS-Select High Capacity (HC) nickel coated plate. 0.1 ml of cell extracts of each clone were placed in a HS microwell plate in the presence of imidazole at a final concentration of 5 mM. This is the selective condition used for screening the different metal ion-affinity clones. HS plates were incubated for 4 hours at room temperature. Plates were then washed 3× with phosphate-buffered saline (PBS) containing 0.05% Tween 20 (PBS-T). The HS plates were then incubated with anti-GST at 1:1,000 dilution in PBS-BSA buffer (0.2 ml/well) for 1 hour at room temperature. HS plates were washed 3× with PBS-T. The HS plates were then incubated with anti-mouse HRP conjugate at 1:10,000 dilution in PBS-BSA buffer for 1 hour at room temperature. Plates were washed 3× with PBS-T. The plate was then developed with 2,2′-azino-bis(3-ethylbenzthiazoline-6-sulfonic acid) (ABTS) substrate. Color development was stopped by the addition of sodium azide to a final concentration of 2 mM. Absorbance of the plates was read at 405 nm using a Wallac 1420 plate reader. The HC plates were used to further analyze potential clones. To further characterize the clones, 0.2 ml of cell extracts were applied to the HC plates and the plates were incubated at ambient temperature for 1 hour. The plates were washed with PBS as described above. Twenty-one clones that produced the highest response on the HS plates were eluted from the corresponding HC plate. The selected cloned proteins were eluted from the HC plates by incubating at 37° C. for 15 minutes in 50 mM sodium phosphate, 0.3 M sodium chloride and 0.2 M imidazole buffer. Eluted proteins were then moved to clean tubes and analyzed by SDS-PAGE. All 21 clones had the expected molecular weight and were sequence verified.

[0174] These 21 colonies were grown overnight in 1 ml LB/ampicillin media at 37° C. at 250 rpm. 100 μl of the overnight cultures were transferred to 50 ml of fresh LB/ampicillin media and the cultures grown for an additional 3 hours at 37° C. The cultures were induced with IPTG (final concentration of 1 mM) and the cultures grown for an additional 3 hours prior to harvesting by centrifugation.

EXAMPLE 2 Construction of an N-Terminal Metal Ion-Affinity Fusion Protein

[0175] Two metal ion-affinity tags were introduced at the N-terminus of bacterial alkaline phosphatase (BAP). The constructs were derived from the BAP expression vector pFLAG-CTS-BAP. Construction was done by annealing two pairs of complementary oligonucleotides together. The following oligonucleotides were constructed: 5′TATGCATAATCATCGACATGAACATA3′ (SEQ ID NO: 20), 5′AGCTTATGTTTATGTCGATGATTATGCA3′ (SEQ ID NO: 21), 5′TATGCATAAACATAGACATGGGCATA3′ (SEQ ID NO: 22) and 5′AGCTTGATGCCCATGTCTATGTTTATGCA3′ (SEQ ID NO: 23). The oligonucleotides were annealed together to generate a cassette. The cassette was ligated into pFLAG-CTS-BAP, which had been digested with NdeI and HindIII restriction endonucleases. Ligated vector was transformed into E. coli DH5-α using standard protocols and plated on LB/ampicillin.
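The peptide encoded by each top-strand oligonucleotide can be read off by translation. The sketch below assumes the standard genetic code and assumes that the reading frame starts at the first ATG of the oligonucleotide (the ATG of the NdeI-compatible end); the trailing base, which would be completed by downstream vector sequence, is ignored. The function name is hypothetical.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_first_atg(dna: str) -> str:
    """Translate complete codons starting at the first ATG (assumed frame)."""
    start = dna.find("ATG")
    if start == -1:
        return ""
    residues = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":                  # stop codon ends translation
            break
        residues.append(aa)
    return "".join(residues)


# Top-strand oligonucleotides as written in the text.
SEQ_ID_20 = "TATGCATAATCATCGACATGAACATA"
SEQ_ID_22 = "TATGCATAAACATAGACATGGGCATA"

PEPTIDE_20 = translate_from_first_atg(SEQ_ID_20)
PEPTIDE_22 = translate_from_first_atg(SEQ_ID_22)
print(PEPTIDE_20, PEPTIDE_22)
```

Under these assumptions both oligonucleotides encode an initiating methionine followed by a seven-residue His-X-His-X-His-X-His stretch.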

EXAMPLE 3 Expression of an N-Terminal Metal Ion-Affinity Fusion Protein

[0176] MAT-BAP fusion peptide cultures were grown overnight in 1 ml LB/ampicillin at 37° C. 500 μl of overnight cultures were transferred to 500 ml of fresh TB media containing ampicillin (100 mg/L). The cultures were grown for three hours at 37° C. at 250 rpm. Protein expression was induced by the addition of IPTG (final concentration of 1 mM). Cultures were then grown for an additional three hours, harvested by centrifugation and stored at −70° C. until further use.
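The inoculation and induction volumes in this example follow from simple dilution arithmetic (C₁V₁ = C₂V₂). The sketch below is illustrative: the 1 M IPTG stock concentration is an assumption, since the text states only the final concentration, and the small volume added to the culture is neglected. The function name is hypothetical.

```python
def stock_volume_ml(final_conc_mM: float, final_volume_ml: float,
                    stock_conc_mM: float) -> float:
    """Volume of stock needed so that C1*V1 = C2*V2 (added volume neglected)."""
    return final_conc_mM * final_volume_ml / stock_conc_mM


CULTURE_ML = 500.0      # fresh TB media volume from the text
INOCULUM_ML = 0.5       # 500 microliters of overnight culture from the text
IPTG_FINAL_mM = 1.0     # final IPTG concentration from the text
IPTG_STOCK_mM = 1000.0  # ASSUMPTION: hypothetical 1 M IPTG stock

iptg_ml = stock_volume_ml(IPTG_FINAL_mM, CULTURE_ML, IPTG_STOCK_mM)
fold_dilution = CULTURE_ML / INOCULUM_ML

print(f"IPTG stock to add: {iptg_ml} ml")
print(f"Inoculum dilution: 1:{fold_dilution:.0f}")
```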

EXAMPLE 4 Metal Ion-Affinity Fusion Protein Purification Protocol #1

[0177] Cells were resuspended in 2 ml of TE (50 mM Tris-HCl pH 8.0, 2 mM EDTA). Lysozyme (4 mg/ml in 2 ml of TE) was added to the resuspended cells and the cells were lysed at ambient temperature for 4 hours. The cell debris was removed by centrifugation at 27,000×g for 15 minutes. The supernatant was dialyzed overnight against 50 mM Tris-HCl pH 8.0 to remove the EDTA. The dialyzed supernatant was applied to a 1 ml column containing a nickel biscarboxy-methyl-cysteine resin (nickel resin). The column was washed with 4 ml of 50 mM Tris-HCl pH 8.0 and then washed with 2 ml of 50 mM Tris-HCl pH 8.0, 10 mM imidazole. The column was then eluted with 50 mM Tris-HCl pH 8.0, 250 mM imidazole. Samples were analyzed for purity by SDS-PAGE.

EXAMPLE 5 Metal Ion-Affinity Fusion Protein Purification Protocol #2

[0178] Cells were resuspended with CelLytic B (Sigma-Aldrich product no. B3553), and 10 mM imidazole. The cells were solubilized by incubation for 15 minutes. The cell debris was removed by centrifugation at 15,000×g for 5 minutes at room temperature. A 0.5 ml column, containing nickel resin, was equilibrated with 10 column volumes (5 ml) of 50 mM sodium phosphate, pH 8, and 300 mM sodium chloride (column buffer). The supernatant was loaded on the column. The column was washed with 10 column volumes (5 ml) of 10 mM imidazole in column buffer. The column was eluted with 100 mM imidazole in column buffer. The samples were analyzed for specificity by SDS-PAGE.

EXAMPLE 6 Metal Ion-Affinity Fusion Protein Purification Protocol #3: Use of Chaotropic Agents

[0179] The cells were resuspended in 100 mM sodium phosphate, pH 8, and 8 M urea (denaturant column buffer). The cells were solubilized by sonication three times, 15 seconds each, with a probe sonicator. Cell debris was removed by centrifugation at 15,000×g for 5 minutes at room temperature. A 0.5 ml column, containing nickel resin, was equilibrated with 10 column volumes (5 ml) of the denaturant column buffer. The supernatant was loaded on the column and the column was washed with 10 column volumes (5 ml) of denaturant column buffer. The column was sequentially eluted with 100 mM sodium phosphate, 8 M urea at pH 7.5, 7.0, 6.5, 6.0, 5.5, 5.0 and 4.5. The samples were analyzed for specificity by SDS-PAGE.
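The stepwise pH elution above can be tabulated alongside a rough estimate of histidine side-chain protonation. The sketch below uses the Henderson-Hasselbalch relation with an assumed imidazole pKa of about 6.0, a textbook approximation that is not stated in this document; actual pKa values vary with sequence context and conditions. It is offered only as a possible rationale for why lower pH steps release bound material, and the function name is hypothetical.

```python
PKA_HIS = 6.0  # ASSUMPTION: approximate pKa of the histidine imidazole group


def fraction_protonated(pH: float, pKa: float = PKA_HIS) -> float:
    """Henderson-Hasselbalch estimate of the protonated fraction at a given pH."""
    return 1.0 / (1.0 + 10 ** (pH - pKa))


# Elution steps from the text: pH 7.5 down to 4.5 in 0.5 unit decrements.
ph_steps = [7.5 - 0.5 * i for i in range(7)]

for pH in ph_steps:
    print(f"pH {pH:.1f}: {fraction_protonated(pH):.2f} protonated")
```

Under this assumption the protonated fraction rises steadily as the pH steps fall, reaching one half at pH 6.0.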

1 23 1 211 PRT Shistosoma japonicum 1 Met Ala Cys Gly His Val Lys LeuIle Tyr Phe Asn Gly Arg Gly Arg 1 5 10 15 Ala Glu Pro Ile Arg Met IleLeu Val Ala Ala Gly Val Glu Phe Glu 20 25 30 Asp Glu Arg Ile Glu Phe GlnAsp Trp Pro Lys Ile Lys Pro Thr Ile 35 40 45 Pro Gly Gly Arg Leu Pro IleVal Lys Ile Thr Asp Lys Arg Gly Asp 50 55 60 Val Lys Thr Met Ser Glu SerLeu Ala Ile Ala Arg Phe Ile Ala Arg 65 70 75 80 Lys His Asn Met Met GlyAsp Thr Asp Asp Glu Tyr Tyr Ile Ile Glu 85 90 95 Lys Met Ile Gly Gln ValGlu Asp Val Glu Ser Asp Tyr His Lys Thr 100 105 110 Leu Ile Lys Pro ProGlu Glu Lys Glu Lys Ile Ser Lys Glu Ile Leu 115 120 125 Asn Gly Lys ValPro Ile Leu Leu Gln Ala Ile Cys Glu Thr Leu Lys 130 135 140 Glu Ser ThrGly Asn Leu Thr Val Gly Asp Lys Val Thr Leu Ala Asp 145 150 155 160 ValVal Leu Ile Ala Ser Ile Asp His Ile Thr Asp Leu Asp Lys Glu 165 170 175Phe Leu Thr Gly Lys Tyr Pro Glu Ile His Lys His Arg Lys His Leu 180 185190 Leu Ala Thr Ser Pro Lys Leu Ala Lys Tyr Leu Ser Glu Arg His Ala 195200 205 Thr Ala Phe 210 2 163 PRT Clostridium cellulovorans 2 Ala AlaThr Ser Ser Met Ser Val Glu Phe Tyr Asn Ser Asn Lys Ser 1 5 10 15 AlaGln Thr Asn Ser Ile Thr Pro Ile Ile Lys Ile Thr Asn Thr Ser 20 25 30 AspSer Asp Leu Asn Leu Asn Asp Val Lys Val Arg Tyr Thr Tyr Tyr 35 40 45 ThrSer Asp Gly Thr Gln Gly Gln Thr Phe Trp Cys Asp His Ala Gly 50 55 60 AlaLeu Leu Gly Asn Ser Tyr Val Asp Asn Thr Ser Lys Val Thr Ala 65 70 75 80Asn Phe Val Lys Glu Thr Ala Ser Pro Thr Ser Thr Tyr Asp Thr Tyr 85 90 95Val Glu Phe Gly Phe Ala Ser Gly Ala Ala Thr Leu Lys Lys Gly Gln 100 105110 Phe Ile Thr Ile Gln Gly Arg Ile Thr Lys Ser Asp Trp Ser Asn Tyr 115120 125 Thr Gln Thr Asn Asp Tyr Ser Phe Asp Ala Ser Ser Ser Thr Pro Val130 135 140 Val Asn Pro Lys Val Thr Gly Tyr Ile Gly Gly Ala Lys Val LeuGly 145 150 155 160 Thr Ala Pro 3 396 PRT Escherichia coli 3 Met Lys IleLys Thr Gly Ala Arg Ile Leu Ala Leu Ser Ala Leu Thr 1 5 10 15 Thr MetMet Phe Ser Ala Ser Ala Leu Ala Lys Ile Glu Glu Gly Lys 20 25 30 Leu ValIle Trp Ile 
Asn Gly Asp Lys Gly Tyr Asn Gly Leu Ala Glu 35 40 45 Val GlyLys Lys Phe Glu Lys Asp Thr Gly Ile Lys Val Thr Val Glu 50 55 60 His ProAsp Lys Leu Glu Glu Lys Phe Pro Gln Val Ala Ala Thr Gly 65 70 75 80 AspGly Pro Asp Ile Ile Phe Trp Ala His Asp Arg Phe Gly Gly Tyr 85 90 95 AlaGln Ser Gly Leu Leu Ala Glu Ile Thr Pro Asp Lys Ala Phe Gln 100 105 110Asp Lys Leu Tyr Pro Phe Thr Trp Asp Ala Val Arg Tyr Asn Gly Lys 115 120125 Leu Ile Ala Tyr Pro Ile Ala Val Glu Ala Leu Ser Leu Ile Tyr Asn 130135 140 Lys Asp Leu Leu Pro Asn Pro Pro Lys Thr Trp Glu Glu Ile Pro Ala145 150 155 160 Leu Asp Lys Glu Leu Lys Ala Lys Gly Lys Ser Ala Leu MetPhe Asn 165 170 175 Leu Gln Glu Pro Tyr Phe Thr Trp Pro Leu Ile Ala AlaAsp Gly Gly 180 185 190 Tyr Ala Phe Lys Tyr Glu Asn Gly Lys Tyr Asp IleLys Asp Val Gly 195 200 205 Val Asp Asn Ala Gly Ala Lys Ala Gly Leu ThrPhe Leu Val Asp Leu 210 215 220 Ile Lys Asn Lys His Met Asn Ala Asp ThrAsp Tyr Ser Ile Ala Glu 225 230 235 240 Ala Ala Phe Asn Lys Gly Glu ThrAla Met Thr Ile Asn Gly Pro Trp 245 250 255 Ala Trp Ser Asn Ile Asp ThrSer Lys Val Asn Tyr Gly Val Thr Val 260 265 270 Leu Pro Thr Phe Lys GlyGln Pro Ser Lys Pro Phe Val Gly Val Leu 275 280 285 Ser Ala Gly Ile AsnAla Ala Ser Pro Asn Lys Glu Leu Ala Lys Glu 290 295 300 Phe Leu Glu AsnTyr Leu Leu Thr Asp Glu Gly Leu Glu Ala Val Asn 305 310 315 320 Lys AspLys Pro Leu Gly Ala Val Ala Leu Lys Ser Tyr Glu Glu Glu 325 330 335 LeuAla Lys Asp Pro Arg Ile Ala Ala Thr Met Glu Asn Ala Gln Lys 340 345 350Gly Glu Ile Met Pro Asn Ile Pro Gln Met Ser Ala Phe Trp Tyr Ala 355 360365 Val Arg Thr Ala Val Ile Asn Ala Ala Ser Gly Arg Gln Thr Val Asp 370375 380 Glu Ala Leu Lys Asp Ala Gln Thr Arg Ile Thr Lys 385 390 395 4524 PRT Staphylococcus aureus 4 Met Lys Lys Lys Asn Ile Tyr Ser Ile ArgLys Leu Gly Val Gly Ile 1 5 10 15 Ala Ser Val Thr Leu Gly Thr Leu LeuIle Ser Gly Gly Val Thr Pro 20 25 30 Ala Ala Asn Ala Ala Gln His Asp GluAla Gln Gln Asn Ala Phe Tyr 35 40 45 Gln Val Leu Asn Met Pro Asn Leu AsnAla Asp Gln Arg Asn 
Gly Phe 50 55 60 Ile Gln Ser Leu Lys Asp Asp Pro SerGln Ser Ala Asn Val Leu Gly 65 70 75 80 Glu Ala Gln Lys Leu Asn Asp SerGln Ala Pro Lys Ala Asp Ala Gln 85 90 95 Gln Asn Asn Phe Asn Lys Asp GlnGln Ser Ala Phe Tyr Glu Ile Leu 100 105 110 Asn Met Pro Asn Leu Asn GluAla Gln Arg Asn Gly Phe Ile Gln Ser 115 120 125 Leu Lys Asp Asp Pro SerGln Ser Thr Asn Val Leu Gly Glu Ala Lys 130 135 140 Lys Leu Asn Glu SerGln Ala Pro Lys Ala Asp Asn Asn Phe Asn Lys 145 150 155 160 Glu Gln GlnAsn Ala Phe Tyr Glu Ile Leu Asn Met Pro Asn Leu Asn 165 170 175 Glu GluGln Arg Asn Gly Phe Ile Gln Ser Leu Lys Asp Asp Pro Ser 180 185 190 GlnSer Ala Asn Leu Leu Ser Glu Ala Lys Lys Leu Asn Glu Ser Gln 195 200 205Ala Pro Lys Ala Asp Asn Lys Phe Asn Lys Glu Gln Gln Asn Ala Phe 210 215220 Tyr Glu Ile Leu His Leu Pro Asn Leu Asn Glu Glu Gln Arg Asn Gly 225230 235 240 Phe Ile Gln Ser Leu Lys Asp Asp Pro Ser Gln Ser Ala Asn LeuLeu 245 250 255 Ala Glu Ala Lys Lys Leu Asn Asp Ala Gln Ala Pro Lys AlaAsp Asn 260 265 270 Lys Phe Asn Lys Glu Gln Gln Asn Ala Phe Tyr Glu IleLeu His Leu 275 280 285 Pro Asn Leu Thr Glu Glu Gln Arg Asn Gly Phe IleGln Ser Leu Lys 290 295 300 Asp Asp Pro Ser Val Ser Lys Glu Ile Leu AlaGlu Ala Lys Lys Leu 305 310 315 320 Asn Asp Ala Gln Ala Pro Lys Glu GluAsp Asn Asn Lys Pro Gly Lys 325 330 335 Glu Asp Asn Asn Lys Pro Gly LysGlu Asp Asn Asn Lys Pro Gly Lys 340 345 350 Glu Asp Asn Asn Lys Pro GlyLys Glu Asp Asn Asn Lys Pro Gly Lys 355 360 365 Glu Asp Asn Asn Lys ProGly Lys Glu Asp Gly Asn Lys Pro Gly Lys 370 375 380 Glu Asp Asn Lys LysPro Gly Lys Glu Asp Gly Asn Lys Pro Gly Lys 385 390 395 400 Glu Asp AsnLys Lys Pro Gly Lys Glu Asp Gly Asn Lys Pro Gly Lys 405 410 415 Glu AspGly Asn Lys Pro Gly Lys Glu Asp Gly Asn Gly Val His Val 420 425 430 ValLys Pro Gly Asp Thr Val Asn Asp Ile Ala Lys Ala Asn Gly Thr 435 440 445Thr Ala Asp Lys Ile Ala Ala Asp Asn Lys Leu Ala Asp Lys Asn Met 450 455460 Ile Lys Pro Gly Gln Glu Leu Val Val Asp Lys Lys Gln Pro Ala Asn 465470 475 480 His Ala Asp Ala 
Asn Lys Ala Gln Ala Leu Pro Glu Thr Gly GluGlu 485 490 495 Asn Pro Phe Ile Gly Thr Thr Val Phe Gly Gly Leu Ser LeuAla Leu 500 505 510 Gly Ala Ala Leu Leu Ala Gly Arg Arg Arg Glu Leu 515520 5 448 PRT Streptococcus 5 Met Glu Lys Glu Lys Lys Val Lys Tyr PheLeu Arg Lys Ser Ala Phe 1 5 10 15 Gly Leu Ala Ser Val Ser Ala Ala PheLeu Val Gly Ser Thr Val Phe 20 25 30 Ala Val Asp Ser Pro Ile Glu Asp ThrPro Ile Ile Arg Asn Gly Gly 35 40 45 Glu Leu Thr Asn Leu Leu Gly Asn SerGlu Thr Thr Leu Ala Leu Arg 50 55 60 Asn Glu Glu Ser Ala Thr Ala Asp LeuThr Ala Ala Ala Val Ala Asp 65 70 75 80 Thr Val Ala Ala Ala Ala Ala GluAsn Ala Gly Ala Ala Ala Trp Glu 85 90 95 Ala Ala Ala Ala Ala Asp Ala LeuAla Lys Ala Lys Ala Asp Ala Leu 100 105 110 Lys Glu Phe Asn Lys Tyr GlyVal Ser Asp Tyr Tyr Lys Asn Leu Ile 115 120 125 Asn Asn Ala Lys Thr ValGlu Gly Ile Lys Asp Leu Gln Ala Gln Val 130 135 140 Val Glu Ser Ala LysLys Ala Arg Ile Ser Glu Ala Thr Asp Gly Leu 145 150 155 160 Ser Asp PheLeu Lys Ser Gln Thr Pro Ala Glu Asp Thr Val Lys Ser 165 170 175 Ile GluLeu Ala Glu Ala Lys Val Leu Ala Asn Arg Glu Leu Asp Lys 180 185 190 TyrGly Val Ser Asp Tyr His Lys Asn Leu Ile Asn Asn Ala Lys Thr 195 200 205Val Glu Gly Val Lys Glu Leu Ile Asp Glu Ile Leu Ala Ala Leu Pro 210 215220 Lys Thr Asp Thr Tyr Lys Leu Ile Leu Asn Gly Lys Thr Leu Lys Gly 225230 235 240 Glu Thr Thr Thr Glu Ala Val Asp Ala Ala Thr Ala Glu Lys ValPhe 245 250 255 Lys Gln Tyr Ala Asn Asp Asn Gly Val Asp Gly Glu Trp ThrTyr Asp 260 265 270 Asp Ala Thr Lys Thr Phe Thr Val Thr Glu Lys Pro GluVal Ile Asp 275 280 285 Ala Ser Glu Leu Thr Pro Ala Val Thr Thr Tyr LysLeu Val Ile Asn 290 295 300 Gly Lys Thr Leu Lys Gly Glu Thr Thr Thr LysAla Val Asp Ala Glu 305 310 315 320 Thr Ala Glu Lys Ala Phe Lys Gln TyrAla Asn Asp Asn Gly Val Asp 325 330 335 Gly Val Trp Thr Tyr Asp Asp AlaThr Lys Thr Phe Thr Val Thr Glu 340 345 350 Met Val Thr Glu Val Pro GlyAsp Ala Pro Thr Glu Pro Glu Lys Pro 355 360 365 Glu Ala Ser Ile Pro LeuVal Pro Leu Thr Pro Ala Thr Pro Ile Ala 
370 375 380 Lys Asp Asp Ala LysLys Asp Asp Thr Lys Lys Glu Asp Ala Lys Lys 385 390 395 400 Pro Glu AlaLys Lys Asp Asp Ala Lys Lys Ala Glu Thr Leu Pro Thr 405 410 415 Thr GlyGlu Gly Ser Asn Pro Phe Phe Thr Ala Ala Ala Leu Ala Val 420 425 430 MetAla Gly Ala Gly Ala Leu Ala Val Ala Ser Lys Arg Lys Glu Asp 435 440 4456 192 PRT Homo sapiens 6 Met Ala Pro Ser Leu Ser Ala Met Thr Pro Trp ThrPro Gly Pro Ser 1 5 10 15 Trp Ser Ser Val Tyr Met Thr Cys Val Trp SerVal Gly Ser Gly Ser 20 25 30 Ala Cys Ala Val Ala Ser Ala Pro Met Pro ArgPro Val Trp Ser Leu 35 40 45 Ala Ser Arg Leu Gly Thr Gly Asp His Gln ProThr Ala Pro Cys Pro 50 55 60 Ala Leu Pro Thr Ala Ala Met Ser Ser Ala AlaLeu Leu Ala Arg Pro 65 70 75 80 Pro Ala Thr Gly Leu Arg Arg Arg Pro ThrAla Pro Gly Ala Pro Ala 85 90 95 Trp Arg Ala Ala Cys Ala Ser Gln Ala SerTrp Pro Ala Ala Ala Pro 100 105 110 Ala Cys Arg Pro Arg Arg Val Ala AlaPro Ser Arg Val Ser Ser Ser 115 120 125 Leu Arg Ala Arg Lys Cys Gly ArgThr Ser Cys Ala Lys Gly Ala Ala 130 135 140 Pro Ala Thr Ala Pro Pro IleArg Ser Pro Ala Ala Thr Ser Arg Ala 145 150 155 160 Ala Arg Arg Val SerAla Ala Ala Ser Arg Thr Ala Ser Trp Ala Ala 165 170 175 Thr Pro Ile AlaSer Gly Pro Ala Arg Gly Pro Gly Thr His Thr Met 180 185 190 7 216 PRTEscherichia coli 7 Met Asn Phe Asn Lys Ile Asp Leu Asp Asn Trp Lys ArgLys Glu Ile 1 5 10 15 Phe Asn His Tyr Leu Asn Gln Gln Thr Thr Phe SerIle Thr Thr Glu 20 25 30 Ile Asp Ile Ser Val Leu Tyr Arg Asn Ile Lys GlnGlu Gly Tyr Lys 35 40 45 Phe Tyr Pro Ala Phe Ile Phe Leu Val Thr Arg ValIle Asn Ser Asn 50 55 60 Thr Ala Phe Arg Thr Gly Tyr Asn Ser Asp Gly GluLeu Gly Tyr Trp 65 70 75 80 Asp Lys Leu Glu Pro Leu Tyr Thr Ile Phe AspGly Val Ser Lys Thr 85 90 95 Phe Ser Gly Ile Trp Thr Pro Val Lys Asn AspPhe Lys Glu Phe Tyr 100 105 110 Asp Leu Tyr Leu Ser Asp Val Glu Lys TyrAsn Gly Ser Gly Lys Leu 115 120 125 Phe Pro Lys Thr Pro Ile Pro Glu AsnAla Phe Ser Leu Ser Ile Ile 130 135 140 Pro Trp Thr Ser Phe Thr Gly PheAsn Leu Asn Ile Asn Asn Asn Ser 145 150 155 160 
Asn Tyr Leu Leu Pro IleIle Thr Ala Gly Lys Phe Ile Asn Lys Gly 165 170 175 Asn Ser Ile Tyr LeuPro Leu Ser Leu Gln Val His His Ser Val Cys 180 185 190 Asp Gly Tyr HisAla Gly Leu Phe Met Asn Ser Ile Gln Glu Leu Ser 195 200 205 Asp Arg ProAsn Asp Trp Leu Leu 210 215 8 160 PRT Streptomyces avidinii 8 Met AspPro Ser Lys Asp Ser Lys Ala Gln Val Ser Ala Ala Glu Ala 1 5 10 15 GlyIle Thr Gly Thr Trp Tyr Asn Gln Leu Gly Ser Thr Phe Ile Val 20 25 30 ThrAla Gly Ala Asp Gly Ala Leu Thr Gly Thr Tyr Glu Ser Ala Val 35 40 45 GlyAsn Ala Glu Ser Arg Tyr Val Leu Thr Gly Arg Tyr Asp Ser Ala 50 55 60 ProAla Thr Asp Gly Ser Gly Thr Ala Leu Gly Trp Thr Val Ala Trp 65 70 75 80Lys Asn Asn Tyr Arg Asn Ala His Ser Ala Thr Thr Trp Ser Gly Gln 85 90 95Tyr Val Gly Gly Ala Glu Ala Arg Ile Asn Thr Gln Trp Leu Leu Thr 100 105110 Ser Gly Thr Thr Glu Ala Asn Ala Trp Lys Ser Thr Leu Val Gly His 115120 125 Asp Thr Phe Thr Lys Val Lys Pro Ser Ala Ala Ser Ile Asp Ala Ala130 135 140 Lys Lys Ala Gly Val Asn Asn Gly Asn Pro Leu Asp Ala Val GlnGln 145 150 155 160 9 1024 PRT Escherichia coli 9 Met Thr Met Ile ThrAsp Ser Leu Ala Val Val Leu Gln Arg Arg Asp 1 5 10 15 Trp Glu Asn ProGly Val Thr Gln Leu Asn Arg Leu Ala Ala His Pro 20 25 30 Pro Phe Ala SerTrp Arg Asn Ser Glu Glu Ala Arg Thr Asp Arg Pro 35 40 45 Ser Gln Gln LeuArg Ser Leu Asn Gly Glu Trp Arg Phe Ala Trp Phe 50 55 60 Pro Ala Pro GluAla Val Pro Glu Ser Trp Leu Glu Cys Asp Leu Pro 65 70 75 80 Glu Ala AspThr Val Val Val Pro Ser Asn Trp Gln Met His Gly Tyr 85 90 95 Asp Ala ProIle Tyr Thr Asn Val Thr Tyr Pro Ile Thr Val Asn Pro 100 105 110 Pro PheVal Pro Thr Glu Asn Pro Thr Gly Cys Tyr Ser Leu Thr Phe 115 120 125 AsnVal Asp Glu Ser Trp Leu Gln Glu Gly Gln Thr Arg Ile Ile Phe 130 135 140Asp Gly Val Asn Ser Ala Phe His Leu Trp Cys Asn Gly Arg Trp Val 145 150155 160 Gly Tyr Gly Gln Asp Ser Arg Leu Pro Ser Glu Phe Asp Leu Ser Ala165 170 175 Phe Leu Arg Ala Gly Glu Asn Arg Leu Ala Val Met Val Leu ArgTrp 180 185 190 Ser Asp Gly Ser Tyr Leu Glu Asp Gln Asp Met 
Trp Arg MetSer Gly 195 200 205 Ile Phe Arg Asp Val Ser Leu Leu His Lys Pro Thr ThrGln Ile Ser 210 215 220 Asp Phe His Val Ala Thr Arg Phe Asn Asp Asp PheSer Arg Ala Val 225 230 235 240 Leu Glu Ala Glu Val Gln Met Cys Gly GluLeu Arg Asp Tyr Leu Arg 245 250 255 Val Thr Val Ser Leu Trp Gln Gly GluThr Gln Val Ala Ser Gly Thr 260 265 270 Ala Pro Phe Gly Gly Glu Ile IleAsp Glu Arg Gly Gly Tyr Ala Asp 275 280 285 Arg Val Thr Leu Arg Leu AsnVal Glu Asn Pro Lys Leu Trp Ser Ala 290 295 300 Glu Ile Pro Asn Leu TyrArg Ala Val Val Glu Leu His Thr Ala Asp 305 310 315 320 Gly Thr Leu IleGlu Ala Glu Ala Cys Asp Val Gly Phe Arg Glu Val 325 330 335 Arg Ile GluAsn Gly Leu Leu Leu Leu Asn Gly Lys Pro Leu Leu Ile 340 345 350 Arg GlyVal Asn Arg His Glu His His Pro Leu His Gly Gln Val Met 355 360 365 AspGlu Gln Thr Met Val Gln Asp Ile Leu Leu Met Lys Gln Asn Asn 370 375 380Phe Asn Ala Val Arg Cys Ser His Tyr Pro Asn His Pro Leu Trp Tyr 385 390395 400 Thr Leu Cys Asp Arg Tyr Gly Leu Tyr Val Val Asp Glu Ala Asn Ile405 410 415 Glu Thr His Gly Met Val Pro Met Asn Arg Leu Thr Asp Asp ProArg 420 425 430 Trp Leu Pro Ala Met Ser Glu Arg Val Thr Arg Met Val GlnArg Asp 435 440 445 Arg Asn His Pro Ser Val Ile Ile Trp Ser Leu Gly AsnGlu Ser Gly 450 455 460 His Gly Ala Asn His Asp Ala Leu Tyr Arg Trp IleLys Ser Val Asp 465 470 475 480 Pro Ser Arg Pro Val Gln Tyr Glu Gly GlyGly Ala Asp Thr Thr Ala 485 490 495 Thr Asp Ile Ile Cys Pro Met Tyr AlaArg Val Asp Glu Asp Gln Pro 500 505 510 Phe Pro Ala Val Pro Lys Trp SerIle Lys Lys Trp Leu Ser Leu Pro 515 520 525 Gly Glu Thr Arg Pro Leu IleLeu Cys Glu Tyr Ala His Ala Met Gly 530 535 540 Asn Ser Leu Gly Gly PheAla Lys Tyr Trp Gln Ala Phe Arg Gln Tyr 545 550 555 560 Pro Arg Leu GlnGly Gly Phe Val Trp Asp Trp Val Asp Gln Ser Leu 565 570 575 Ile Lys TyrAsp Glu Asn Gly Asn Pro Trp Ser Ala Tyr Gly Gly Asp 580 585 590 Phe GlyAsp Thr Pro Asn Asp Arg Gln Phe Cys Met Asn Gly Leu Val 595 600 605 PheAla Asp Arg Thr Pro His Pro Ala Leu Thr Glu Ala Lys His Gln 610 615 
620 Gln Gln Phe Phe Gln Phe Arg Leu Ser Gly Gln Thr Ile Glu Val Thr 625 630 635 640 Ser Glu Tyr Leu Phe Arg His Ser Asp Asn Glu Leu Leu His Trp Met 645 650 655 Val Ala Leu Asp Gly Lys Pro Leu Ala Ser Gly Glu Val Pro Leu Asp 660 665 670 Val Ala Pro Gln Gly Lys Gln Leu Ile Glu Leu Pro Glu Leu Pro Gln 675 680 685 Pro Glu Ser Ala Gly Gln Leu Trp Leu Thr Val Arg Val Val Gln Pro 690 695 700 Asn Ala Thr Ala Trp Ser Glu Ala Gly His Ile Ser Ala Trp Gln Gln 705 710 715 720 Trp Arg Leu Ala Glu Asn Leu Ser Val Thr Leu Pro Ala Ala Ser His 725 730 735 Ala Ile Pro His Leu Thr Thr Ser Glu Met Asp Phe Cys Ile Glu Leu 740 745 750 Gly Asn Lys Arg Trp Gln Phe Asn Arg Gln Ser Gly Phe Leu Ser Gln 755 760 765 Met Trp Ile Gly Asp Lys Lys Gln Leu Leu Thr Pro Leu Arg Asp Gln 770 775 780 Phe Thr Arg Ala Pro Leu Asp Asn Asp Ile Gly Val Ser Glu Ala Thr 785 790 795 800 Arg Ile Asp Pro Asn Ala Trp Val Glu Arg Trp Lys Ala Ala Gly His 805 810 815 Tyr Gln Ala Glu Ala Ala Leu Leu Gln Cys Thr Ala Asp Thr Leu Ala 820 825 830 Asp Ala Val Leu Ile Thr Thr Ala His Ala Trp Gln His Gln Gly Lys 835 840 845 Thr Leu Phe Ile Ser Arg Lys Thr Tyr Arg Ile Asp Gly Ser Gly Gln 850 855 860 Met Ala Ile Thr Val Asp Val Glu Val Ala Ser Asp Thr Pro His Pro 865 870 875 880 Ala Arg Ile Gly Leu Asn Cys Gln Leu Ala Gln Val Ala Glu Arg Val 885 890 895 Asn Trp Leu Gly Leu Gly Pro Gln Glu Asn Tyr Pro Asp Arg Leu Thr 900 905 910 Ala Ala Cys Phe Asp Arg Trp Asp Leu Pro Leu Ser Asp Met Tyr Thr 915 920 925 Pro Tyr Val Phe Pro Ser Glu Asn Gly Leu Arg Cys Gly Thr Arg Glu 930 935 940 Leu Asn Tyr Gly Pro His Gln Trp Arg Gly Asp Phe Gln Phe Asn Ile 945 950 955 960 Ser Arg Tyr Ser Gln Gln Gln Leu Met Glu Thr Ser His Arg His Leu 965 970 975 Leu His Ala Glu Glu Gly Thr Trp Leu Asn Ile Asp Gly Phe His Met 980 985 990 Gly Ile Gly Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu Phe 995 1000 1005 Gln Leu Ser Ala Gly Arg Tyr His Tyr Gln Leu Val Trp Cys Gln 1010 1015 1020 Lys 10 238 PRT Aequorea victoria 10 Met Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val 1 5
10 15 Glu Leu Asp Gly Asp Val Asn Gly Gln Lys Phe Ser Val Ser Gly Glu 20 25 30 Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys 35 40 45 Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe 50 55 60 Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln 65 70 75 80 His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg 85 90 95 Thr Ile Phe Tyr Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val 100 105 110 Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile 115 120 125 Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Met Glu Tyr Asn 130 135 140 Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Pro Lys Asn Gly 145 150 155 160 Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Lys Asp Gly Ser Val 165 170 175 Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro 180 185 190 Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser 195 200 205 Lys Asp Pro Asn Glu Lys Arg Asp His Met Ile Leu Leu Glu Phe Val 210 215 220 Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys 225 230 235 11 109 PRT Escherichia coli 11 Met Ser Asp Lys Ile Ile His Leu Thr Asp Asp Ser Phe Asp Thr Asp 1 5 10 15 Val Leu Lys Ala Asp Gly Ala Ile Leu Val Asp Phe Trp Ala Glu Trp 20 25 30 Cys Gly Pro Cys Lys Met Ile Ala Pro Ile Leu Asp Glu Ile Ala Asp 35 40 45 Glu Tyr Gln Gly Lys Leu Thr Val Ala Lys Leu Asn Ile Asp Gln Asn 50 55 60 Pro Gly Thr Ala Pro Lys Tyr Gly Ile Arg Gly Ile Pro Thr Leu Leu 65 70 75 80 Leu Phe Lys Asn Gly Glu Val Ala Ala Thr Lys Val Gly Ala Leu Ser 85 90 95 Lys Gly Gln Leu Lys Glu Phe Leu Asp Ala Asn Leu Ala 100 105 12 13 PRT Oryctolagus cuniculus 12 Ile Ala Val Ser Ala Ala Asn Arg Phe Lys Lys Ile Ser 1 5 10 13 10 PRT artificial sequence synthetic c-myc epitope 13 Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu 1 5 10 14 7 PRT artificial sequence synthetic HA epitope 14 Tyr Pro Tyr Asp Val Tyr Ala 1 5 15 8 PRT artificial sequence synthetic FLAG sequence 15 Asp Tyr Lys Asp Asp Asp Asp Lys 1 5 16 8 PRT artificial sequence XpressTM leader peptide 16 Asp Leu
Tyr Asp Asp Asp Asp Lys 1 5 17 23 PRT artificial sequence synthetic 3X FLAG sequence 17 Met Asp Tyr Lys Asp His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp 1 5 10 15 Tyr Lys Asp Asp Asp Asp Lys 20 18 30 DNA artificial sequence chemically synthesized oligonucleotide 18 gatcccatdn dcatdndcat dndcattaac 30 19 30 DNA artificial sequence chemically synthesized oligonucleotide 19 aattgttaat ghnhatghnh atghnhatgg 30 20 26 DNA artificial sequence chemically synthesized oligonucleotide 20 tatgcataat catcgacatg aacata 26 21 28 DNA artificial sequence chemically synthesized oligonucleotide 21 agcttatgtt tatgtcgatg attatgca 28 22 26 DNA artificial sequence chemically synthesized oligonucleotide 22 tatgcataaa catagacatg ggcata 26 23 29 DNA artificial sequence chemically synthesized oligonucleotide 23 agcttgatgc ccatgtctat gtttatgca 29
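
The oligonucleotides of SEQ ID NOS: 18 and 19 contain the ambiguity symbols d, h and n, so each listed entry stands for a pool of concrete sequences rather than a single one. The following is a minimal Python sketch, not part of the original disclosure, that counts and enumerates such a pool. It assumes the standard IUPAC meanings of the symbols (d = a, g or t; h = a, c or t; n = any base); the names `IUPAC`, `degeneracy`, `expand` and `SEQ_ID_18` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes. Only the symbols that occur
# in SEQ ID NOS: 18 and 19 are included in this illustrative table.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "d": "agt",   # a, g or t (not c)
    "h": "act",   # a, c or t (not g)
    "n": "acgt",  # any base
}

def degeneracy(oligo: str) -> int:
    """Number of concrete sequences represented by a degenerate oligo."""
    count = 1
    for base in oligo.lower():
        count *= len(IUPAC[base])
    return count

def expand(oligo: str) -> list[str]:
    """Enumerate every concrete sequence (only sensible for short inputs)."""
    choices = [IUPAC[base] for base in oligo.lower()]
    return ["".join(combo) for combo in product(*choices)]

# SEQ ID NO: 18 as listed, with the grouping spaces removed.
SEQ_ID_18 = "gatcccatdndcatdndcatdndcattaac"

print(len(SEQ_ID_18))         # 30 nucleotides, matching the listed length
print(degeneracy(SEQ_ID_18))  # 3**6 * 4**3 = 46656 possible sequences
print(len(expand("dnd")))     # 36 variants of a single degenerate triplet
```

In this sketch the fixed triplets `cat` are histidine codons in the standard genetic code, and the degenerate `dnd` triplets lie between them; which reading frame is actually used in the constructs is not stated in the listing itself and is left open here.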

We claim:
 1. A polypeptide, protein or protein fragment represented by the formula R₁-Sp₁-(His-Z₁-His-Arg-His-Z₂-His)-Sp₂-R₂, wherein (His-Z₁-His-Arg-His-Z₂-His) is a metal ion-affinity peptide, R₁ is hydrogen, a polypeptide, protein or protein fragment, Sp₁ is a covalent bond or a spacer comprising at least one amino acid residue, R₂ is hydrogen, a polypeptide, protein or protein fragment, Sp₂ is a covalent bond or a spacer comprising at least one amino acid residue, Z₁ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val, and Z₂ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr, and Val.
 2. The peptide of claim 1 wherein Z₁ is selected from the group consisting of Ala, Asn, Ile, Lys, Phe, Ser, Thr, and Val, and Z₂ is selected from the group consisting of Ala, Asn, Gly, Lys, Ser, Thr and Tyr.
 3. The peptide of claim 1 wherein Z₁ is selected from the group consisting of Asn and Lys, and Z₂ is selected from the group consisting of Gly and Lys.
 4. The peptide of claim 1, wherein Z₁ is Ile and Z₂ is Asn.
 5. The peptide of claim 1, wherein Z₁ is Thr and Z₂ is Ser.
 6. The peptide of claim 1, wherein Z₁ is Ser and Z₂ is Tyr.
 7. The peptide of claim 1, wherein Z₁ is Val and Z₂ is Ala.
 8. The peptide of claim 1, wherein Z₁ is Ala and Z₂ is Lys.
 9. The peptide of claim 1, wherein Z₁ is Asn and Z₂ is Lys.
 10. The peptide of claim 1 wherein Z₁ is Lys and Z₂ is Gly.
 11. The peptide of claim 1 wherein R₁ or R₂ is hydrogen.
 12. The peptide of claim 1 wherein R₁ or R₂ is an amino acid residue.
 13. The peptide of claim 1 wherein Sp₁ or Sp₂ is a spacer comprising a proteolytic cleavage site, a fusion protein, a secretion sequence, a leader sequence for cellular targeting, an antibody epitope, or an internal ribosomal sequence.
 14. The peptide of claim 1 wherein Sp₁ or Sp₂ is a spacer comprising a proteolytic cleavage site.
 15. The peptide of claim 14 wherein the proteolytic cleavage site is cleaved with enterokinase.
 16. The peptide of claim 1 wherein any one of Sp₁, Sp₂, R₁ and R₂ comprises at least one of the amino acid sequences selected from the group consisting of SEQ ID NOS: 1-17.
 17. The peptide of claim 1 wherein Sp₁ or Sp₂ is a spacer comprising the enzyme glutathione-S-transferase of the parasite helminth Schistosoma japonicum.
 18. The peptide of claim 1 wherein Sp₁ or Sp₂ is a spacer comprising the amino acid sequence DYKDDDDK.
 19. The peptide of claim 1 wherein Sp₁ or Sp₂ is a spacer comprising the amino acid sequence DLYDDDDK.
 20. The peptide of claim 1 wherein Sp₁ or Sp₂ is a spacer comprising the amino acid sequence Met-Asp-Tyr-Lys-Asp-His-Asp-Gly-Asp-Tyr-Lys-Asp-His-Asp-Ile-Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys.
 21. The peptide of claim 1 wherein Sp₁ or Sp₂ is a spacer comprising a polypeptide possessing an amino acid having at least 70% homology to any one of the amino acid sequences disclosed in SEQ ID NOS: 1-17 and having the same binding characteristics as said amino acid.
 22. A process for separating a polypeptide, protein or protein fragment of claim 1 from a liquid mixture, the process comprising contacting immobilized metal ions with the liquid mixture to bind the polypeptide, protein or protein fragment to the immobilized metal ions.
 23. The process of claim 22 further comprising releasing the polypeptide, protein or protein fragment or a portion thereof from the immobilized metal ions.
 24. The process of claim 23 wherein the polypeptide, protein or protein fragment comprises a proteolytic cleavage site adjacent to the metal ion-affinity peptide and the process further comprises contacting the bound polypeptide, protein or protein fragment with a proteolytic enzyme which cleaves the polypeptide, protein or protein fragment at the cleavage site, thereby releasing a portion of the polypeptide, protein or protein fragment from the immobilized metal ions.
 25. The process of claim 24 wherein the metal ions are immobilized on a resin derivatized by a nitrilotriacetic acid derivative.
 26. The process of claim 24 wherein the immobilized metal ions are the divalent ions of metals selected from the group consisting of aluminum, cadmium, calcium, cobalt, copper, gallium, iron, nickel, ytterbium and zinc.
 27. The process of claim 24 wherein the immobilized metal ions are the divalent ions of metals selected from the group consisting of nickel and cobalt.
 28. The process of claim 24 wherein the immobilized metal ions are nickel (II).
 29. The process of claim 22, wherein the metal ions are a component of a metal chelate derived from a composition corresponding to formula (1):

wherein Q is a carrier; S¹ is a spacer; L is -A-T-CH(X)- or -C(═O)-; A is an ether, thioether, selenoether, or amide linkage; T is a bond or substituted or unsubstituted alkyl or alkenyl; X is -(CH₂)_(k)CH₃, -(CH₂)_(k)COOH, -(CH₂)_(k)SO₃H, -(CH₂)_(k)PO₃H₂, -(CH₂)_(k)N(J)₂, or -(CH₂)_(k)P(J)₂, preferably -(CH₂)_(k)COOH or -(CH₂)_(k)SO₃H; k is an integer from 0 to 2; J is hydrocarbyl or substituted hydrocarbyl; Y is -COOH, -H, -SO₃H, -PO₃H₂, -N(J)₂, or -P(J)₂, preferably -COOH; Z is -COOH, -H, -SO₃H, -PO₃H₂, -N(J)₂, or -P(J)₂, preferably -COOH; and i is an integer from 0 to 4, preferably 1 or 2.
 30. The process of claim 29, wherein the metal chelate is derived from a composition selected from the group consisting of:

wherein Q is a carrier and Ac is acetyl.
 31. The process of claim 22 wherein the metal ions are a component of a metal chelate derived from a composition corresponding to the formula:

wherein Q is a carrier; S² is -O-CH₂-CH(OH)-CH₂- or -O-CO-; and x is 2, 3, or 4.
 32. The process of claim 30, wherein the metal chelate comprises an ion selected from the group consisting of Ni²⁺, Zn²⁺, Cu²⁺, Fe³⁺, Co²⁺, Ca²⁺, Al³⁺, Mg²⁺, and Mn²⁺.
 33. The process of claim 31, wherein the metal chelate comprises an ion selected from the group consisting of Ni²⁺, Zn²⁺, Cu²⁺, Fe³⁺, Co²⁺, Ca²⁺, Al³⁺, Mg²⁺, and Mn²⁺.
 34. The process of claim 30, wherein the metal chelate comprises Ni²⁺.
 35. The process of claim 31, wherein the metal chelate comprises Ni²⁺.
 36. A recombinant vector comprising a vector and a DNA sequence coding for the polypeptide, protein or protein fragment of claim 1, wherein the recombinant vector is capable of directing expression of the DNA sequence in a compatible unicellular host organism.
 37. The recombinant vector of claim 36 wherein the DNA sequence encodes the peptide of claim 9.
 38. A host cell comprising the recombinant vector as set forth in claim 36.
 39. The host cell of claim 38 wherein the recombinant vector comprises a DNA sequence coding for the peptide of claim 10.
 40. The host cell of claim 39 wherein said host cell is E. coli, yeast, insect cells, mammalian cells, or plant.
 41. A process for producing the polypeptide, protein or protein fragment of claim 1, the process comprising (a) transforming a host cell with a recombinant vector encoding the polypeptide, protein or protein fragment; (b) culturing the host cell under conditions which permit the expression of the polypeptide, protein or protein fragment; (c) lysing the host cell; and (d) purifying the polypeptide, protein or protein fragment or a portion thereof by metal ion affinity chromatography.
 42. The process of claim 41 wherein the recombinant vector comprises a DNA sequence coding for the polypeptide, protein or protein fragment, wherein the recombinant vector is capable of directing expression of the DNA sequence in a compatible host cell.
 43. The process of claim 42 wherein the recombinant vector comprises a DNA sequence coding for the peptide of claim 9 or 10.
 44. The process of claim 43 wherein the host cell is E. coli, yeast, insect cells, mammalian cells, or plants.
 45. A polypeptide, protein or protein fragment represented by the formula R₁-Sp₁-(His-Z₁-His-Arg-His-Z₂-His)_(t)-Sp₂-R₂, wherein (His-Z₁-His-Arg-His-Z₂-His) is a metal ion-affinity peptide, t is at least 2, R₁ is hydrogen, a polypeptide, protein or protein fragment, Sp₁ is a covalent bond or a spacer comprising at least one amino acid residue, R₂ is hydrogen, a polypeptide, protein or protein fragment, Sp₂ is a covalent bond or a spacer comprising at least one amino acid residue, Z₁ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val, and Z₂ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr, and Val.
 46. The peptide of claim 45, wherein Z₁ is Asn and Z₂ is Lys.
 47. The peptide of claim 45, wherein Z₁ is Lys and Z₂ is Gly.
 48. A polypeptide, protein or protein fragment represented by the formula R₁-Sp₁-[(His-Z₁-His-Arg-His-Z₂-His)-Sp₂]_(t)-R₂, wherein (His-Z₁-His-Arg-His-Z₂-His) is a metal ion-affinity peptide, t is at least 2, R₁ is hydrogen, a polypeptide, protein or protein fragment, Sp₁ is a covalent bond or a spacer comprising at least one amino acid residue, R₂ is hydrogen, a polypeptide, protein or protein fragment, Sp₂ is a covalent bond or a spacer comprising at least one amino acid residue, Z₁ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val, and Z₂ is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr, and Val; and each Sp₂ of the recombinant polypeptides, proteins or protein fragments may be the same or different.
 49. The peptide of claim 48, wherein Z₁ is Asn and Z₂ is Lys.
 50. The peptide of claim 48, wherein Z₁ is Lys and Z₂ is Gly.
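
The metal ion-affinity peptide recited in claims 1, 45 and 48 can be expressed as a simple pattern over one-letter amino acid codes. The following Python sketch is offered only as an illustration and forms no part of the claims. It assumes the standard three-letter to one-letter mapping for the Z₁ and Z₂ residue sets of claim 1; the names `Z1`, `Z2`, `MOTIF` and `find_affinity_motifs` are hypothetical.

```python
import re

# Z1 set of claim 1 in one-letter code:
# Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, Val
Z1 = "ARNDQEIKFPSTWV"

# Z2 set of claim 1 in one-letter code:
# Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser,
# Thr, Tyr, Val
Z2 = "ARNDCQEGILKMPSTYV"

# His-Z1-His-Arg-His-Z2-His, capturing Z1 and Z2 for reporting.
MOTIF = re.compile(f"H([{Z1}])HRH([{Z2}])H")

def find_affinity_motifs(sequence: str) -> list[tuple[int, str, str]]:
    """Return (start index, Z1, Z2) for each non-overlapping motif found."""
    return [
        (match.start(), match.group(1), match.group(2))
        for match in MOTIF.finditer(sequence.upper())
    ]

# Z1 = Asn, Z2 = Lys (the combination of claim 9) inside a longer sequence.
print(find_affinity_motifs("MHNHRHKHGS"))      # [(1, 'N', 'K')]

# Two adjacent copies, as in the t = 2 case of claim 45.
print(find_affinity_motifs("HNHRHKHHNHRHKH"))  # [(0, 'N', 'K'), (7, 'N', 'K')]

# A plain run of histidines does not match: His is in neither residue set.
print(find_affinity_motifs("HHHHHHH"))         # []
```

The pattern tests sequence membership only; it says nothing about whether a given polypeptide actually binds immobilized metal ions, which is an experimental question.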